


THE JOURNAL OF 
EDUCATIONAL PSYCHOLOGY 


Volume XIV December, 1923 Number 9 














ON THE IMPROVEMENT IN INTELLIGENCE
SCORES FROM FOURTEEN TO EIGHTEEN¹


E. L. THORNDIKE 
Institute of Educational Research, Teachers College, Columbia University 


Adults measured by the intelligence tests in common use attain 
scores little if any above those made by 14-year-old children in school; 
and this has given rise to the conclusion that in general intelligence 
does not improve after about that age. On the other hand, whenever 
repeated measurements have been made over an interval of a year or 
more upon the same individuals initially 14 or 15, there has been a 
marked improvement. So in the case of Woolley (’14 and ’15), Brooks 
(’21), Cobb (’22), and others. The data from repeated measurements 
need, however, some allowance for the special practice in taking the 
tests themselves, and the amount of this has not been known with 
surety. 

We have been able to make very extensive measurements a year 
apart and to measure the allowance to be made for the special practice 
adequately, in the case of the sort of person who goes to high school and 
stays there for at least a year and a half. 

In May, 1922, 4473 pupils in Grade IX, 4304 in Grade X and 3544 in
Grade XI in various high schools were tested with an examination
representing a composite of recognized group tests of intelligence.
In May, 1923, the pupils in Grades X, XI, and XII of the same schools
were tested with an alternate form of the same examination. 2790 + 
3136 + 2638 of the original 4473 + 4304 + 3544 were among those 
measured in 1923, representing (except for temporary absences or 
removals to other cities) those who had continued in school a year. 

Schools 4-15 had Form A of the examination in 1922 and Form B 
in 1923; schools 21-25 had Form B in 1922 and Form A in 1923. In 


1 The investigation reported here was made possible by a grant from the
Commonwealth Fund.







general the median gain for ’23 over ’22 for pupils who took both exam-
inations is 26.4 ± .48 (PE) when the order was AB and 19.8 ± .66
(PE) when the order was BA. If we assume that the real gains of
pupils in schools 21-25 were equal to the real gains of pupils in schools
4-15, B is 3.3 points (±.41) easier than A.
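
The arithmetic behind this estimate can be sketched as follows. It assumes, as the text does, that the real gain is the same under both orders, so that the form difference is half the gap between the two observed gains. The rule used for the probable error (root sum of squares, then halved) is an assumption; the text does not state it, although it gives the printed ±.41. The variable names are illustrative only.

```python
import math

# Median gains (1923 over 1922) and their probable errors, from the text.
gain_ab, pe_ab = 26.4, 0.48   # Form A in 1922, Form B in 1923
gain_ba, pe_ba = 19.8, 0.66   # Form B in 1922, Form A in 1923

# With equal real gains in both groups:
#   gain_ab = real + d   and   gain_ba = real - d,
# where d is the number of points by which B is easier than A.
form_difference = (gain_ab - gain_ba) / 2

# Assumed rule: probable error of half the difference of two
# independent estimates.
pe_difference = math.sqrt(pe_ab ** 2 + pe_ba ** 2) / 2

print(round(form_difference, 1), round(pe_difference, 2))
```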

An independent method of equating them is by comparing the 
scores of pupils taking Examination A and Examination B, in both 
cases as the first trial. In 1923 in schools 4 to 15 there were many 
pupils tested with Examination B in Grades XI and XII who by reason 
of absence or change of residence had not taken Form A, the 1922 
examination. On the average we may expect the Grade X pupils 
who thus took Examination B as their first trial in 1923 to be of nearly 
equal intellectual ability¹ with the Grade X pupils in 1922 who took
Examination A. The data available appear in Table I. Weighting
the B−A differences by the smaller of the two populations involved we
have 1½ as the average. Weighting by the square root of the smaller
of the two populations involved we have 1.1 as the average. We
may take (2.25 + 2)/[illegible] or 1.7 as a conservative probable error for it.





TABLE I.—MEDIAN SCORES IN FORM A AND FORM B OF PUPILS TAKING THE
EXAMINATION AS A FIRST TRIAL

School     Grade   Form A   n       Form B   n       B−A
9          IX      150½     large   151      153     ½
11         IX      169      94      170      98      1
4 to 11    X       178      large   176      104     −2
4 to 11    XI      195      large   200½     85      5½
12         X       194      large   212      131     18
12         XI      205      large   199      86      −6
13         X       189      large   197      59      8
13         XI      204½     large   200      56      −4½
14         X       175      large   163      78      −12
14         XI      182      large   185      53      3
15         X       196      large   185      38      −11
15         XI      206½     large   165      20      −31½
21 to 25   IX      154      173     166½     large   12½
21 to 25   X       190      744     188½     large   −1½
21 to 25   XI      193      267     199      large   6
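
The two weighted averages of the B−A differences can be sketched as below. The differences and group sizes are those legible in Table I; the helper `weighted_mean` is a hypothetical name. Because several table entries are damaged in the source (the row for School 15, Grade XI does not agree with its own medians), the results need not match the printed figures exactly: with these readings the first weighting gives about 1½, as in the text, and the second gives a little under the printed 1.1.

```python
import math

# (B - A difference, smaller of the two group sizes), one pair per row of
# Table I as read from the source; 5 1/2 is written 5.5, and so on.
rows = [
    (0.5, 153), (1.0, 94), (-2.0, 104), (5.5, 85), (18.0, 131),
    (-6.0, 86), (8.0, 59), (-4.5, 56), (-12.0, 78), (3.0, 53),
    (-11.0, 38), (-31.5, 20), (12.5, 173), (-1.5, 744), (6.0, 267),
]

def weighted_mean(pairs, weight):
    """Average the differences, weighting each by weight(n)."""
    total_weight = sum(weight(n) for _, n in pairs)
    return sum(d * weight(n) for d, n in pairs) / total_weight

by_n = weighted_mean(rows, lambda n: n)        # weight = smaller population
by_sqrt_n = weighted_mean(rows, math.sqrt)     # weight = its square root

print(round(by_n, 2), round(by_sqrt_n, 2))
```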























1 Moving from one city to another is probably indifferent to intellect; being 
absent from school is perhaps slightly negatively related to intellect. 























B then may be taken to be about 2½ points easier than A.
Subtracting 2½ from each individual’s gain who took the tests in the
order A—B and adding 2½ to the gain of each individual who used
the order B—A we have 22½ as the median gain and 23 as the average
gain.

A certain amount of the gain made during the year is probably 
due to the special practice with the examination itself, although 
fore-exercise was given in both ’22 and ’23 to reduce this practice 
effect. We have measured it as follows: We have, as just stated, 
many pupils in Grades X, XI and XII in 1923 in schools 4-15 who 
took Examination B as their first trial, by reason of absence in 1922. 
They are presumably on the average nearly equal in intellect to those 
who were present both in ’22 and ’23 and so took Examination B as 
their second trial. The average difference in favor of the latter may 
then be taken as the result of the first trial’s experience. Similarly 
for schools 21-25 with Examination A. We have made the 
computations with the results shown in Table II. 

To have the amount due to the growth and training from May ’22
to May ’23 without this special practice effect of second over first
trial we then subtract 11.9 from 22½ or 23, leaving 10.6 or 11.1.


TABLE II.—MEDIAN SCORES OF PUPILS TAKING THE SAME FORM OF THE
EXAMINATION AS FIRST TRIAL AND AS SECOND TRIAL

Schools   Grade   Form   First trial   Second trial   Advantage of second trial
4-11      X       B      176.0         184.6          8.6
4-11      XI      B      200.4         215.0          14.6
4-11      XII     B      207.2         222.8          15.6
12-15     X       B      194.6         208.3          13.7
12-15     XI      B      195.3         210.3          15.0
12-15     XII     B      215.5+        203.8          — 14.2
21-25     X       A      190.3         190.0          −.3
21-25     XI      A      193.3         213.5          20.2
21-25     XII     A      213.7         219.3          5.6

Average, 11.9 ± PE 1
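
The practice allowance and the net gain can be checked with a short sketch. The nine advantages are those of Table II. The entry for schools 12-15, Grade XII is taken here as +14.2; its sign is unclear in the source, and the positive reading is an assumption, chosen because it is the one under which the nine values average to the printed 11.9.

```python
# Advantage of second trial over first trial, one value per row of Table II.
# The sixth value is read as +14.2 (assumption; see the note above).
advantages = [8.6, 14.6, 15.6, 13.7, 15.0, 14.2, -0.3, 20.2, 5.6]

practice_effect = sum(advantages) / len(advantages)

# Gains after equating the two forms (from the text): median 22.5, average 23.
median_gain, average_gain = 22.5, 23.0

# The text subtracts the rounded allowance, 11.9, from each gain.
allowance = round(practice_effect, 1)
net_median_gain = round(median_gain - allowance, 1)
net_average_gain = round(average_gain - allowance, 1)

print(allowance, net_median_gain, net_average_gain)
```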






















What this average gain of 11.1 in a year amounts to may be realized 
from the fact that it is about one-third of the mean square deviation 
of individuals in Grades IX, X, and XI, in first-trial score with the 
examination or about one-half of what their mean square deviation 
would be if they were perfectly measured. Since the variability of 
the high school population in Grades IX to XI may be estimated to 
be at least half that of all 14-year-olds, and since the mean square 
deviation of all 14-year-olds may be estimated as 23 months of mental 
age, measured by Stanford Binet or about 21 months if perfectly 
measured, our gain may be set as equivalent to at least 10 months of 
mental age around 14. It is thus a gain of considerable magnitude. 

The gain is very closely the same for pupils in Grade IX, in Grade 
X, and in Grade XI, in 1922, the medians being 22.4, 23.6 and 23.4 for 
the entire gain, or 10.5, 11.7, and 11.5 if 11.9 is subtracted from each 
for the effect of special practice with the tests. Any decrease in gain 
with age, if such there be, is offset by the selection of those 
more capable of gain. 

REFERENCES 


Brooks, F. D., ’21: Changes in Mental Traits with Age. Contributions to Educa- 
tion, No. 116, Teachers College, Columbia University. 

Cobb, M. V., ’22: The Limits Set to Educational Achievement by Limited Intelli- 
gence. Journal of Educational Psychology, Vol. xiii, pp. 449-464 and 546-555. 

Woolley, H. T. and Fischer, C. R., ’14: Mental and Physical Measurements of
Working Children. Psychology Monog., No. 77.

Woolley, H. T., ’15: A New Scale of Mental and Physical Measurements for Adoles-
cents and Some of Its Uses. Journal of Educational Psychology, Vol. vi,
pp. 521-550.










THE RELATIVE PREDICTIVE VALUES OF CERTAIN 
INTELLIGENCE AND EDUCATIONAL TESTS 
TOGETHER WITH A STUDY OF THE 
EFFECT OF EDUCATIONAL 
ACHIEVEMENT UPON 
INTELLIGENCE 
TEST SCORES 


ARTHUR I. GATES AND JESSIE LA SALLE¹
Teachers College, Columbia University 


SUMMARY 


This study is based upon 75 pupils tested during two school years 
at intervals of four months with a battery of achievement tests, and 
twice at an interval of 12 months with the Stanford-Binet and the 
National Intelligence Tests. The main results are as follows: 

1. The obtained correlations between the National Intelligence 
Tests and achievement tests, and the intercorrelations of achievement 
tests decrease steadily as the intervals increase from 0 to 20 months. 

2. The Stanford-Binet predicts achievement 20 months later about 
as well as for shorter periods. 

3. When corrections for the unreliability of tests are made, it
appears that the best means of predicting achievement in a particular 
subject, reading, spelling or arithmetic, is to use a test of that subject 
itself. 

4. The usefulness of an intelligence test, because of the universality 
of its currency, is indicated, however. 

5. That the National Intelligence Test reflects in a measure the 
effects of information and skill progressively accumulated in school is 
indicated by a positive correlation between gains in National Intelli- 
gence Test and gains in achievement during a period of 12 months. 

6. That the Stanford-Binet reflects slightly, or not at all, the effects 
of schooling under the conditions of the experiment is indicated by 
zero correlations between gains. 

7. Analyses by means of the correlation of columns of a
correlation table (Spearman’s criterion) and by means of partial
correlations agree in suggesting that intelligence is not a quality, everywhere
one and the same, but a composite or average of many abilities
variously related.





1 Based on data secured in the Scarborough School, Scarborough, N. Y. during 
the academic years 1920-1922. 




THE GENERAL PROBLEM AND METHODS 


The intelligence test and scale, the invention of which is accredited 
to Binet, and tests and scales of scholastic achievement which 
originated with Thorndike, have enjoyed a parallel and, until recently, 
a somewhat independent growth. The group tests of intelligence 
developed out of the superior technique of the achievement tests and 
embraced in no small measure identical content. Nevertheless, it 
has been quite commonly assumed that intelligence tests measure one 
type of thing, namely, native ability or inborn capacity, whereas the 
scholastic tests measure another thing—school achievement, acquired 
ability. The wide use of the accomplishment ratio is substantial 
testimony of this fact. 

This paper presents an inquiry into several pertinent relations
between typical representatives of three types of tests: the individual
Stanford-Binet Intelligence Scale, the National (group) Intelligence
Test and several group tests of scholastic achievement.

The function of the intelligence test is prediction. The fact that 
intelligence tests give substantial correlations with achievement in 
school subjects measured at the same time—a fact to which nearly all 
of our studies are confined—is alone insufficient justification for the 
use of the intelligence tests. It may be that future success in school 
subjects is predicted more accurately by scholastic tests themselves 
than by intelligence tests. This possibility, at any rate, will be one 
of the subjects of the present inquiry—the relative predictive value of 
intelligence and educational tests. 

The subjects utilized were pupils mainly from Grades III, IV, V 
and VI of the Scarborough School in the academic year 1920-21 who 
also completed all of the work during the year 1921-22, about 75 in all. 
They were given the Stanford-Binet once in each year, mainly during
the first semester; the National Intelligence Test once in October of
each year; and a battery of tests in reading comprehension (Thorndike-
McCall), reading rate (two forms of the Courtis or Burgess or one of
each), arithmetic (all four forms of the Woody) and spelling (60 words
from the Ayres Scale). The achievement tests were given in October,
January, and May of each year, i.e., at intervals of approximately
four months during two school years. From these data, it will be
possible to compute the correlations between tests given at the same
time or at intervals of 4, 8, 12, 16 or 20 months.

The National Intelligence Test (hereafter designated by NIT) 
given in 1920 was correlated with six tests of comprehension in reading, 







one in October, 1920; January, May, October, 1921; January and May 
1922, and similarly with six tests each for rate of reading, arithmetic 
and spelling, making a total of 24 coefficients. There were 24 coeffi- 
cients, also, between the NIT given in 1921, and the several scholastic 
tests, or 48 for both. Similarly there were 48 r’s between Stanford-
Binet Mental Ages (MA) and the educational tests. The inter-
correlations of the achievement tests are more numerous—36
between each group of six tests for each subject, making in all 432
coefficients.
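
How the coefficients distribute over the intervals can be sketched by enumerating pairs of testing occasions. The sketch assumes six occasions spaced exactly four months apart (the text says approximately four months); the variable names are illustrative.

```python
from collections import Counter
from itertools import combinations, product

SPACING = 4                                   # months (assumed exact)
occasions = [i * SPACING for i in range(6)]   # 0, 4, 8, 12, 16, 20

# One subject against another: every occasion of the first test paired
# with every occasion of the second, 36 coefficients in all.
cross_subject = Counter(abs(a - b) for a, b in product(occasions, repeat=2))

# A test against itself: unordered pairs of distinct occasions, 15 in all.
self_pairs = Counter(b - a for a, b in combinations(occasions, 2))

# NIT of October 1920 (first occasion) and of October 1921 (fourth
# occasion) against the six occasions of one achievement test.
nit_1920 = Counter(t - occasions[0] for t in occasions)
nit_1921 = Counter(abs(t - occasions[3]) for t in occasions)

print(dict(cross_subject))
print(dict(self_pairs))
print(dict(nit_1920), dict(nit_1921))
```

Under this assumption the 1921 tests supply two coefficients each at 4 and 8 months and none beyond 12, which is why the tables for 1921 stop at the 12-month column.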

Since the group embraced the range of ages found in four grades, 
the correlations will be relatively high, but since the children them- 
selves in this school were selected, few having an IQ of less than 100, 
the coefficients will, by this means, be reduced. What the coefficients 
would be in the case of an unselected group of the same age measured 
by perfectly reliable instruments could be estimated only at the cost 
of great labor and then only roughly. At present, the method of 
correlation serves its purpose most fully in portraying relative associa- 
tions. This may be accomplished in the present study with a high 
degree of reliability since the same subjects, and no others, are used 
throughout. The reader will understand that the correlations 
presented below mean little except as one is compared with others 
obtained in this study, and even this statement demands certain 
qualifications which will be made in due time. 

Predictions for Varied Periods of Time.—The first question is 
concerned with the influence of an interval of time between tests upon 
the correlations. In Tables I, II and III are given the correlations
arranged according to the intervals between tests. Table IV is a
summary of Tables I, II and III.


TABLE I.—CORRELATIONS OF NATIONAL INTELLIGENCE TEST, OCTOBER, 1920 WITH
SCHOLASTIC TESTS

Interval in months        0      4      8      12     16     20
Reading comprehension    .88    .82    .80    .82    .73    .72
Reading rate             .83    .82    .82    .82    .76    .72
Arithmetic               .91    .89    .91    .91    .83    .78
Spelling                 .85    .78    .82    .84    .84    .80
Mean                     .868   .828   .838   .848   .790   .755







CORRELATIONS OF NATIONAL INTELLIGENCE TEST, OCTOBER, 1921 WITH
SCHOLASTIC TESTS

Interval in months        0      4      8      12
Reading comprehension    .87    .80    .78    .83
                                .82    .81
Reading rate             .89    .83    .78    .80
                                .80    .82
Arithmetic               .92    .89    .85    .85
                                .90    .88
Spelling                 .82    .78    .78    .76
                                .81    .77
Mean                     .875   .828   .809   .81























TABLE II.—CORRELATIONS OF MA 1920 WITH SCHOLASTIC TESTS

Interval in months        0      4      8          12         16     20
Reading comprehension    .74    .73    [illeg.]   [illeg.]   .71    .75
Reading rate             .65    .70    .66        .67        .65    .68
Arithmetic               .77    .83    .85        .86        .84    .83
Spelling                 .60    .63    .61        .62        .61    .59
Mean                     .69    .723   .725       .718       .703   .713




















CORRELATIONS OF MA 1921 WITH SCHOLASTIC TESTS¹

Interval in months        0      4      8      12
Reading comprehension    .78    .72    .72    .77
                                .76    .76
Reading rate             .63    .56    .60    .63
                                .64    .65
Arithmetic               .83    .83    .76    .72
                                .83    .81
Spelling                 .62    .59    .58    .59
                                .60    .60
Mean                     .715   .717   .710   .678























1 Not all of the Binet Tests were given in the same month; they were grouped in
the modal month, and IQ’s multiplied by CA’s of that month to give the MA.


TABLE III.—CORRELATION OF READING COMPREHENSION WITH READING RATE
INTERVALS IN MONTHS

        Same time   4 months      8 months      12 months   16 months   20 months
        .69         .70   .70     .75   .70     .74         .71         .63
        .78         .77   .69     .70   .71     .66         .68         .64
        .75         .75   .80     .60   .74     .69         .60
        .77         .68   .76     .69   .78     .67         .64
        .67         .69   .60                   .62
        .73                                     .74
Mean    .732        .714          .709          .687        .658        .635
PE      .011        .011          .012          .011        .014        .010

Correlations of Reading Comprehension with Arithmetic

        Same time   4 months      8 months      12 months   16 months   20 months
        .79         .72   .80     .74   .78     .74         .70         .65
        .75         .76   .78     .85   .79     .65         .64         .71
        .75         .76   .81     .72   .70     .73         .79
        .84         .74   .83     .73   .79     .79         .72
        .85         .73   .71                   .75
        .78                                     .84
Mean    .793        .764          .763          .749        .713        .68
PE      .009        .008          .011          .012        .013        .014

Correlations of Reading Comprehension with Spelling

        Same time   4 months      8 months      12 months   16 months   20 months
        .72         .71   .70     .70   .76     .70         .60         .66
        .73         .70   .72     .73   .64     .70         .69         .68
        .74         .72   .74     .69   .73     .66         .76
        .80         .69   .76     .63   .72     .69         .66
        .70         .64   .64                   .65
        .70                                     .68
Mean    [illeg.]    .708          .701          .680        .678        .67
PE      .010        .008          .010          .005        .010        .005

Correlations of Reading Rate with Arithmetic

        Same time   4 months      8 months      12 months   16 months   20 months
        .78         .66   .75     .73   .70     .72         .62         .56
        .68         .76   .73     .71   .70     .61         .53         .68
        .69         .67   .65     .62   .68     .57         .73
        .75         .63   .71     .58   .70     .74         .67
        .75         .69   .63                   .71
        .70                                     .68
Mean    .725        .688          .678          .672        .638        .62
PE      .015        .101          .011          .017        .025        .029

Correlations of Reading Rate with Spelling

        Same time   4 months      8 months      12 months     16 months   20 months
        .75         .69   .72     .72   .72     .72   .73     .66         .63
        .73         .73   .76     .71   .73     .66   .68     .61         .68
        .72         .70   .77     .60   .79     .63   .65     .73
        .73         .67   .74     .66   .67                   .65
        .83         .68   .68
        .65
Mean    .735        .705          .70           .679          .663        .655
PE      .012        .007          .013          .010          .015        .013










TABLE III.—Continued.
Correlations of Arithmetic with Spelling

        Same time   4 months      8 months      12 months   16 months   20 months
        .73         .74   .75     .72   .75     .75         .60         .64
        .64         .80   .77     .71   .72     .70         .73         .70
        .82         .68   .74     .60   .81     .64         .79
        .72         .72   .60     .72   .68     .72         .68
        .79         .73   .76                   .70
        .77                                     .69
Mean    .745        .729          .714          .70         .70         .67
PE      .012        .009          .009          .006        .013        .011
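
As a check on the layout of Table III, the Mean row can be recomputed from the individual coefficients. The sketch below does this for the first block (reading comprehension with reading rate), assuming each printed mean is the simple arithmetic mean of the coefficients at that interval. The grouping of values by interval is a reading of the damaged source, not something the text states.

```python
# Reading comprehension with reading rate, grouped by interval in months,
# as read from the first block of Table III.
columns = {
    0:  [.69, .78, .75, .77, .67, .73],
    4:  [.70, .70, .77, .69, .75, .80, .68, .76, .69, .60],
    8:  [.75, .70, .70, .71, .60, .74, .69, .78],
    12: [.74, .66, .69, .67, .62, .74],
    16: [.71, .68, .60, .64],
    20: [.63, .64],
}

# Means as printed in the Mean row of the same block.
printed_means = {0: .732, 4: .714, 8: .709, 12: .687, 16: .658, 20: .635}

column_means = {months: sum(r) / len(r) for months, r in columns.items()}

for months in columns:
    print(months, round(column_means[months], 4), printed_means[months])
```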














TABLE IV.—SUMMARY OF TABLES I, II AND III

Showing the average correlations of tests indicated in first column at the left with
the individual achievement tests

                              0          4        8        12       16       20
                              months     months   months   months   months   months
National Intelligence Test    .872       .828     .824     .829     .790     .755
MA                            [illeg.]   .720     .719     .698     .703     .713
Comprehension                 .752       .729     .724     .705     .683     .662
Reading rate                  .731       .702     .696     .679     .653     .637
Arithmetic                    .754       .727     .718     .707     .683     .656
Spelling                      .737       .714     .705     .686     .680     .665
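
The rows of Table IV appear to be simple averages of the Mean rows of the earlier tables. The sketch below illustrates this for the NIT row and for one entry of the reading-rate row. The averaging rule (an unweighted mean of the table means available at each interval) is an inference from the figures, not a statement in the text, and `summary_row` is a hypothetical helper.

```python
# Mean rows of Table I (NIT of 1920 and of 1921), keyed by interval in months.
nit_1920_means = {0: .868, 4: .828, 8: .838, 12: .848, 16: .790, 20: .755}
nit_1921_means = {0: .875, 4: .828, 8: .809, 12: .81}

def summary_row(*tables):
    """Unweighted mean, at each interval, of the table means that exist."""
    row = {}
    for months in sorted(set().union(*tables)):
        present = [t[months] for t in tables if months in t]
        row[months] = sum(present) / len(present)
    return row

nit_row = summary_row(nit_1920_means, nit_1921_means)

# Reading rate at zero interval: its same-time means with comprehension,
# arithmetic and spelling, taken from Table III.
rate_same_time = (.732 + .725 + .735) / 3

printed_nit_row = {0: .872, 4: .828, 8: .824, 12: .829, 16: .790, 20: .755}
print({m: round(v, 4) for m, v in nit_row.items()}, round(rate_same_time, 3))
```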























A survey of the detailed data or the summary will disclose one 
important fact: All tests except the Stanford-Binet show a gradual 
and fairly uniform decrease in the correlations with achievement as 
the tests become more widely separated in time. While it starts with 
a markedly lower correlation than the NIT and slightly lower than the 
single scholastic test, MA “holds up” better—in fact, it maintains a 
level of association. It correlates with achievement 20 months 
hence quite as well as with achievement at the time it was given. 

Despite the steady decrease in the correlations between NIT and 
achievement with the increase in the intervals between the tests, the 
NIT shows at 20 months a correlation with these subjects that is higher 
than that given by the MA. Judging from the general trend, however,
it appears that the former may, sooner or later, descend to or below 
the predictive value of the latter. This possibility, however attrac- 
tively implied in the data, should not be assumed as certainty. 







Any single achievement test gives a very fair correlation with 
achievement in other functions, on the average, although not as good, 
after 16 months or more, as either of the intelligence tests. If, how- 
ever, we make up a kind of imitation NIT by combining the scores of 
all achievement tests, and correlate them with single measures of 
achievement, as was done both with NIT and MA, we secure an equally
good predictive instrument. A sampling of the results thus obtained
is as follows:


TABLE V.—CORRELATION OF FOUR EQUALLY WEIGHTED EDUCATIONAL TESTS
WITH SINGLE ACHIEVEMENT TESTS SELECTED AT RANDOM

Same time   4 months   8 months   12 months   16 months   20 months
.85         .80        .86        .80         .83         .79
.88         .85        .87        .77         .80         .75




















These r’s are too few to yield very reliable results, but they suggest 
correlations at least as high as those given by NIT and higher than 
those by MA.

How Well Does an Achievement Test Predict Itself?—The next 
consideration is: How well does reading or arithmetic predict itself? 
Does arithmetic predict itself as well as the MA, NIT or a composite of 
several school subjects predict it? The data are given in Table VI. 

It should be noted that the self-correlations are based on tests 
given at intervals of four months or more, since but one test was given 
at a time for each function. In comparing the self-correlations with 
the intercorrelations of Table IV, the first columns in the latter table 
should be disregarded. 

A comparison of the means of Table IV and the means of Table VI 
will disclose the fact that a test in comprehension or rate of reading, 
arithmetic or spelling will predict future achievement in the same 
function better than any one will predict future attainments in any 
other function—the average of the self-correlations is .825 compared
to .702 for the intercorrelations of the several tests, or averaging the
self-correlations of tests separated by 20 months, we obtain a
mean coefficient of .802 as compared to a similar mean of the
intercorrelations of .655.

The tests of comprehension and rate of reading here used, however, 
predict themselves 16 or 20 months later slightly if at all better than 

























TABLE VI.—SHOWING THE SELF-CORRELATIONS FOR THE SEVERAL SCHOLASTIC
TESTS

        Reading comprehension with reading        Reading rate with reading
        comprehension                             rate

        Intervals between tests in months         Intervals between tests in months

        4          8          12     16     20    4      8      12     16     20
        [illeg.]   .78        .70    .76    .75   .80    .80    .82    .78    .74
        .76        .74        .78    .76          .78    .77    .69    .67
        .85        .80        .77                 .86    .73    .69
        .75        [illeg.]                       .74    .78
        .80                                       .80
Mean    .79        .77        .75    .76    .75   .80    .77    .73    .73    .74
PE      .010       .008       .014   0      0     .012   .009   .024   .026   0

        Arithmetic with arithmetic                Spelling with spelling

        4          8          12     16     20    4      8      12     16     20
        .91        .86        .91    .89    .84   .91    .93    .88    .91    .88
        .88        .88        .86    .86          .90    .88    .87    .85
        .91        .87        .89                 .88    .91    .85
        .89        .89                            .91    .86
        .96                                       .88
Mean    .91        .88        .89    .88    .84   .90    .90    .87    .88    .88
PE      .011       .005       .008   .008   0     .004   .010   .005   .014   0
































the NIT or the Stanford-Binet (or a composite of all of the educational
tests), but the test of arithmetic predicts itself slightly, and the test of
spelling appreciably, more accurately than the intelligence tests, as
shown in the coefficients repeated below in Table VII.


Under the conditions of the present study, the practical
recommendations concerning the use of the tests here studied would be
as follows:

1. To predict future scholastic achievement in general for a period
of 16 months or less use the NIT or a composite score obtained by
combining the results of the educational tests.

























TABLE VII

Intervals                             16 months   20 months
NIT with comprehension                .73         .72
MA with comprehension                 .71         .75
Comprehension with comprehension      .76         .75
NIT with rate                         .76         .72
MA with rate                          .65         .68
Rate with rate                        .73         .74
NIT with arithmetic                   .83         .78
MA with arithmetic                    .84         .83
Arithmetic with arithmetic            .88         .84
NIT with spelling                     .84         .80
MA with spelling                      .61         .59
Spelling with spelling                .88         .88











2. To predict future achievement in general at a period of 20 
months, the Stanford-Binet and the NIT are equally good. Probably 
but not certainly for periods more than 20 months later the Binet will 
yield the higher correlation. 

3. To predict achievement 16 or 20 months later in arithmetic or 
spelling specifically, use the achievement tests themselves. For 
predicting comprehension in or rate of reading, the specific reading 
tests are no better than the NIT and but little better than the 
Stanford-Binet. 

4. The Stanford-Binet differs from the other tests in showing no 
decrease in correlations with educational achievements, in general, as 
the intervals between tests increase from zero to 20 months. 

The Correlations between Several Tests Rendered Equal in Reliability. 
The correlations used in the preceding discussions were those 
actually obtained. The magnitudes of those correlations depend 
not only upon the intimacy of the associations of the functions but also 
upon the accuracy with which the functions are measured by the 
particular tests used. Other things being equal, the more reliable the 
test the higher the correlation. If we wish, therefore, to ascertain how 
closely intelligence, as represented in these tests, correlates with the 
type of reading, arithmetic, or spelling tested, or how well these types 










of achievement correlate with each other, we must make allowances 
for the differences in reliability of the instruments; we must make them 
equally reliable. 

If a test were entirely reliable, two forms of it should give the same
relative positions when applied with a short interval between to the
same group of subjects, i.e., the correlations between the two tests
would be 1.00. Insofar as this self correlation is less than unity, the
tests are relatively unreliable. The simplest way, statistically, to make
all our tests equally reliable is to make certain corrections which make
them perfectly reliable, i.e., to yield a self correlation of 1.00. The
device for this purpose is the formula for correction for attenuation.¹
To correct for attenuation, it is necessary first to secure the correla- 
tions of a form of a test with another form, or the same form, given at a 
brief interval—a correlation usually called the reliability coefficient. In 
this investigation, the shortest interval between two tests of the same 
function was four months, followed by others at four-month intervals. 
What the correlation would be between two tests given at shorter 
intervals or at approximately zero interval must be estimated. This 
was done by computing the average decrease in the self correlations 
for a four months period by comparing the several intervals 4-8 
months, 8-12, etc. This average amount has been added to the aver- 
age of the self correlations at a four-month interval, giving us, at least 
approximately, the self correlation of tests given in immediate succes- 
sion, that is, the reliability coefficient, which is given below for each 
test. 

RELIABILITY COEFFICIENTS

Reading comprehension    .80
Reading rate             .81
Arithmetic               .93
Spelling                 .90
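
The extrapolation to zero interval described above can be sketched from the Mean rows of Table VI. The sketch assumes the "average decrease" is the mean of the successive four-month drops in the mean self-correlations; the function name is illustrative. With the rounded means the results fall within about .005 of the coefficients listed.

```python
# Mean self-correlations at 4, 8, 12, 16 and 20 months (Table VI).
self_corr_means = {
    "reading comprehension": [.79, .77, .75, .76, .75],
    "reading rate":          [.80, .77, .73, .73, .74],
    "arithmetic":            [.91, .88, .89, .88, .84],
    "spelling":              [.90, .90, .87, .88, .88],
}

def estimated_reliability(means):
    """Add the average four-month drop to the four-month self-correlation."""
    drops = [earlier - later for earlier, later in zip(means, means[1:])]
    return means[0] + sum(drops) / len(drops)

estimates = {name: estimated_reliability(m) for name, m in self_corr_means.items()}

for name, value in estimates.items():
    print(name, round(value, 3))
```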


For the intelligence tests, lacking crucial data but taking into 
account the results of certain other studies, the actual self-correlations 
at an interval of a year (.93 in both cases) are accepted as 
approximately correct reliability coefficients. 

In Table VIII are given the correlations which would be obtained 
if all of the tests were equally reliable. In column (1) are given the 
corrected correlations for tests given at an interval of 16 months; in 
column (2) at an interval of 20 months, and in the last column the 
mean of both periods, to secure the most reliable coefficients. 


1 This is discussed by T. L. Kelley in his “Statistical Method,” New York:
Macmillan 1923, pp. 204f.






TABLE VIII.—CORRELATIONS CORRECTED FOR UNRELIABILITY OF MEASURES

                                      (1)         (2)         Average of
Intervals                             16 months   20 months   (1) and (2)
NIT with comprehension                .855        .843        .849
NIT with rate                         .879        .833        .856
NIT with arithmetic                   .900        .846        .873
NIT with spelling                     .920        .877        .898
Mean                                  .888        .850        .869
MA with comprehension                 .831        .878        .854
MA with rate                          [illeg.]    .787        .769
MA with arithmetic                    [illeg.]    .911        .916
MA with spelling                      .669        .646        .657
Mean                                  [illeg.]    .806        .799
Comprehension with comprehension      .959        .947        .953
Rate with rate                        .901        .913        .907
Arithmetic with arithmetic            .954        .922        .938
Spelling with spelling                .974        .974        .974
Mean                                  .947        .939        .943
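
The correction for attenuation in its usual form divides an obtained correlation by the square root of the product of the two reliability coefficients. The sketch below applies that standard formula to one entry; it is an illustration, not necessarily the exact procedure the authors followed. With the rounded reliabilities given in the text the result is close to, but not identical with, the corresponding entry of Table VIII (.855), which suggests that unrounded reliabilities or a variant of the formula were used.

```python
import math

def correct_for_attenuation(r_xy, r_xx, r_yy):
    """Standard correction: obtained r divided by sqrt of the reliabilities."""
    return r_xy / math.sqrt(r_xx * r_yy)

# NIT with reading comprehension at 16 months: obtained r = .73 (Table VII),
# reliabilities .93 (NIT) and .80 (comprehension), as given in the text.
nit_comprehension_16 = correct_for_attenuation(0.73, 0.93, 0.80)

# A test with perfect reliability on both sides is left unchanged.
unchanged = correct_for_attenuation(0.73, 1.0, 1.0)

print(round(nit_comprehension_16, 3), unchanged)
```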











With perfectly reliable tests, correlations are obtained which
differ in certain respects from those actually obtained, as shown in
Tables IV and VI. The NIT shows correlations of nearly equal mag-
nitude with the four school subjects whereas the Binet predicts most
accurately achievement in arithmetic, with comprehension a close
second, rate third and spelling a low fourth. The corrected coefficients
also show that the several functions, had the instruments of measure-
ment been perfect, would predict themselves 16 or 20 months later
nearly equally well. At the twenty-month interval, the self correlations
are already higher than the predictions made by NIT which are, in
turn, very slightly higher than those of MA.

With better instruments for measuring achievement, then, the 
test of a school subject itself is the best predictive measure of future 
achievement in that subject. 

That measures of ability in a school subject are excellent indications
of future achievements is a fact (although by no means a new or even a
recent fact) of utmost importance in the understanding of the principles
underlying the attempts to measure general and specific native
aptitudes or capacities. Those who have assailed the intelligence
tests, insisting that, since most of them measure, as they do, in various 
degrees the actual scholastic attainments of individuals, they cannot 
represent native capacity nor be used for prediction, have not given 
due weight to the fact that under certain conditions demonstrable 
ability is a very excellent index of capacity. In connection with the 
present results, it should be stated that the permanence of aptitude 
in these subjects, especially reading and spelling, was given a very 
severe trial inasmuch as throughout these two school years, very 
special attention was given to those retarded.1 Their difficulties
were the subjects of prolonged study; the retarded pupils received
unusual remedial treatment involving greater time and attention than
were devoted to others. The relative positions of the pupils were not
greatly changed, however, after two school years of teaching which
was, as a matter of fact, aimed at changing them.

Remarks, then, that intelligence tests merely measure “education”
or “acquired intelligence,” if carrying an implication that they are 
therefore not indicative of native capacity, miss the point. Under 
conditions of approximate educational equality, achievement is a most 
excellent symptom of native capacity. We know native capacity, and 
measure it, as we know and measure electricity, gravity or typhoid 
fever; we know it from certain symptoms; we measure it by measuring 
the symptoms after knowing the correlations of symptoms with the 
capacity, form of energy, or disease. 

Why Use Intelligence Tests?—We may now raise the question:
Why use intelligence tests, since achievement tests predict future 
attainments under the conditions of the present study as well as the 
intelligence tests? 

In the first place, due weight must be given to the conditions of the 
present investigation. The data were secured only from those pupils 
who were in the same school, who had taken all of the tests, were sub- 
ject to the same methods of teaching, the same distribution of time and 
attention to the several subjects, and were held to similar ideals of 
achievement. Under such conditions the prophecies from achievement
tests are maximum. If we had selected pupils from various
schools, with varied curricula, varied standards of achievements, 
varied grade placements of material, varied time allotments, varied 


1 As described in one of the writers’ monographs, “The Psychology of Reading
and Spelling with Special Reference to Disability,” Teachers College,
Bureau of Publications, 1922.




merits among teachers, the correlations with future achievement of the 
pupils would probably be lower than those which appear in this study. 
In other words, the achievement test may be expected to predict 
future achievements with a high degree of fidelity only when all of the 
pupils have been subjected to the same, or at least similar, 
educational treatment. 

Another, and perhaps more important matter, is the equality of 
experience with the tests. We know that repeated testing with the 
group tests usually results in an increase in the scores. When all of 
the individuals have been given the same number of tests at the same 
time, the scores will increase but this will not greatly influence the cor- 
relations. If, on the contrary, a group is composed of some who have 
never been tested before, others previously tested one, two, etc., 
times, the resulting scores will probably have less predictive value since 
each score depends in an appreciable measure upon previous experience 
with the test. 

The Stanford-Binet is, like any other test, susceptible to practice
effects.1 If tests are repeated at a wide interval, say a year, this
effect is not great. For the 75 pupils retested after an interval of about
a year, the average gain in IQ was +2.2 points. Similar results have
been found by others.2 That the Stanford-Binet is susceptible to
practice effects transferred in any great amounts from experience with
group tests is unlikely, judging from results elsewhere published.3
Practice influences, then, for the Stanford-Binet can probably be easily
ascertained and allowed for.

The influence of educational attainments, school information and 
skill, presents a more difficult problem. Over this influence many 
disputes have been, and still are, waged. There are two convictions, 
the first of which, shared by many, is here given in the words of Terman:
“There is no reason to believe that ordinary differences such
as those obtained among unselected children attending the same
general type of school in a civilized community, affect to any great
extent the validity of the scale.” In contrast to this view, is one





1 As shown most clearly by Katherine Graves in an unpublished Doctor’s 
Dissertation, Teachers College Library. 

2 See a summary by Rugg and Colloton: Constancy of the Stanford-Binet IQ
as Shown by Re-tests, Journal of Educational Psychology, 1921, Vol. XII, pp. 315-332,
and by Baldwin and Stecher, Mental Growth Curve of Normal and Superior
Children, University of Iowa Studies, Vol. II, No. I, 1922.

3 Gates, A. I.: The Unreliability of MA and IQ Based on Group Tests of General
Mental Ability. Journal of Applied Psychology, March, 1923.
















expressed by Cyril Burt:1 “There can be little doubt that with the
Binet-Simon Scale, a child’s mental age is a measure not only of the 
amount of intelligence with which he is congenitally endowed, not only 
of the plane of intelligence at which in the course of life and growth he 
has eventually arrived; it is also an index, largely, if not mainly, of the 
mass of scholastic information and skill which, by virtue of attendance 
more or less regular, by dint of instruction more or less effective, he 
has progressively accumulated in school.” More specifically, Burt
estimates that of the gross mental age, one-ninth is attributable independently
to age, one-third to native intellectual endowment, and over
one-half to school attainments.

The practical significance of the Binet Test depends upon which, 
if either, of these sharply contrasting views is correct. How may 
the facts be determined? 

Burt undertook a solution in the following manner. For about 300 
pupils of ages between 7 and 14, he secured the chronological age, 
Binet Mental Age and education age by means of tests, the results 
of which were revised by the teachers. Between scholastic attain- 
ments and mental age was an obtained correlation of 0.91. 

The task, now, is to explain the cause of this correlation. Does the 
observed correspondence result from an influence of intelligence upon 
achievement, or is it the reverse; is the Binet Mental Age determined 
“largely, if not mainly, by the mass of scholastic information and skill 
—progressively accumulated in school,” or are both related to other 
factors which influence them similarly if not equally? 

The group of pupils displays a wide range of ages, for one thing, 
and since both mental age and school achievement increase with age, 
part of the observed correlation is due to this factor. Correlating both 
mental age and school attainments with age, it is possible, by the 
technique of partial correlation to eliminate the influence of age. 
The residual or “partial” correlation of mental age and school attainment,
with age eliminated, is +0.68, a substantial correlation. We are
left, however, without an inkling as to the cause of this relation. 

Burt went one step farther. He attempted to eliminate pure 
intelligence from the correlation. If this could be done, the residual 
or partial correlation might fairly be taken to indicate the influence of 
scholastic achievement on performance in the Binet test. But to 
eliminate native intelligence it was, of course, necessary first of all to 
have a measure of it. Burt accepted as a criterion of intelligence, 





1 “Mental and Scholastic Tests,” London: P. S. King, 1921, p. 182.







ratings obtained from his Reasoning Tests, “the results of which were
revised by the teachers.” Assuming that this was an index of native
intellect, by proper statistical methods, Burt obtained the following
regressions which “indicate the relative proportions in which the
three factors—age, intelligence, and school attainments—together
determine a child’s achievement in the Binet-Simon Tests.” The equation
is: MA = .54 school attainment + .33 intelligence + .11 age. It
is on this equation that Burt’s statement, earlier made, was based.
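
The correspondence between the regression weights and Burt's verbal fractions (over one-half, one-third, one-ninth) can be checked with a few lines of arithmetic. This is only a sketch: reading the weights as proportional shares follows Burt's own interpretation as quoted above, and the variable names are illustrative.

```python
from fractions import Fraction

# Regression weights quoted from Burt's equation.
weights = {"school attainment": 0.54, "intelligence": 0.33, "age": 0.11}

# Fractions named in Burt's verbal summary ("over one-half" for attainment).
verbal = {
    "school attainment": Fraction(1, 2),
    "intelligence": Fraction(1, 3),
    "age": Fraction(1, 9),
}

total = sum(weights.values())  # the three weights sum to 0.98, not 1.00

for name, weight in weights.items():
    share = weight / total  # weight rescaled so that the shares sum to 1
    print(
        f"{name:18s} weight={weight:.2f} "
        f"share={share:.3f} verbal={float(verbal[name]):.3f}"
    )
```

The weights for intelligence and age fall within about .01 of one-third and one-ninth, and the weight for school attainment exceeds one-half, which agrees with the statement attributed to Burt.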

The validity of Burt’s conclusions depends entirely on the validity
of his criterion of pure intelligence. Few would be willing to admit,
on the basis of evidence that Burt has offered, that the reasoning tests
or that teachers’ judgments or both together may be properly used as a
criterion by means of which the Binet may be evaluated. Until a test
known to be reliable and also a valid measure of native intellectual 
capacity shall have been devised, it will be impossible to estimate the 
influence of education in the easy manner which Burt has utilized. 

Some of the material collected in the present study affords a check, 
even if a very rough one, on Burt’s hypothesis. Selecting children 
tested by the Binet twice, at an interval of approximately twelve months, 
it is possible to compare the growth in mental age (or change in IQ) 
with the advancement in educational attainments. If it is true, as 
Burt contends, that the score on the Binet test is “largely, if not
mainly” due to “information and skill—accumulated in school,” it
should follow that those children who make relatively great educational 
progress should tend to make relatively great advancement on the 
mental tests. It may be illuminating also to compare gains in scores 
on the NIT, as well as on the Binet, with increases in achievement in 
school subjects. 

For the Binet, gains in MA and IQ, for the NIT gains in raw scores, 
and for the four education tests, gains in terms of “scaled” units, i.e.,
units equal at any point on the scale, were computed. It will be 
advisable to make comparisons with the same grades since to compare 
gains at different levels, especially in case of the intelligence tests, is a 
questionable practice. 

The central tendency of the association of gains in either intelli- 
gence test with gains in the various subjects is, of course, depicted by 
the coefficients of correlation. They are shown in Table IX. 
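
The procedure just described, computing each pupil's gain on two measures and then correlating the gains, can be sketched as follows. The scores are hypothetical and serve only to show the mechanics; the helper names `gains` and `pearson_r` are illustrative, and the product-moment formula is assumed to be the coefficient intended.

```python
import math


def gains(first, second):
    """Gain for each pupil: second testing minus first testing."""
    return [b - a for a, b in zip(first, second)]


def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length lists."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two lists of equal length (at least 2 items)")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical scores for six pupils (not data from the study).
nit_first = [35, 60, 72, 80, 91, 100]
nit_second = [70, 85, 90, 104, 110, 131]
reading_first = [20, 31, 40, 42, 55, 60]
reading_second = [33, 40, 47, 55, 63, 75]

r_gains = pearson_r(
    gains(nit_first, nit_second), gains(reading_first, reading_second)
)
print(round(r_gains, 2))
```

Because each gain is a difference of two fallible scores, it carries the errors of both, which is one reason correlations between gains tend to be low.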

TABLE IX

Correlation of gains in NIT with gains in

             Reading   Reading compre-   Arith-   Spell-   Writ-
             rate      hension           metic    ing      ing      MA      IQ
Grade III     .23        -.04             .02      .17      .25     .28     .26
Grade IV      .27         .16            -.04      .05      .09     .03     .05
Grade V       .33         .20             .00     -.07     -.04     .06     .02
Grade VI      .10         .38             .22      .06     -.18     .05    -.06
Average       .23         .175            .05      .04      .03     .105    .07

Correlation of gains in MA with gains in

             Reading   Reading compre-   Arith-   Spell-   Writ-
             rate      hension (Th-McC)  metic    ing      ing      IQ
Grade III    -.08         .00            -.08      .04      .00     .955
Grade IV      .03         .04             .07     -.08     -.05     .887
Grade V       .00         .18             .00     -.03      .05     .85
Grade VI     -.03        -.01             .12     -.10     -.12     .91
Average      -.02         .05             .03     -.04     -.03     .90

Correlation of gains in IQ with gains in

             Reading   Reading compre-   Arith-   Spell-   Writ-
             rate      hension           metic    ing      ing
Grade III    -.04         .02            -.13      .16     -.02
Grade IV     -.08         .16             .21     -.02      .02
Grade V       .08         .03            -.03      .00     -.16
Grade VI      .05        -.04            -.15      .07     -.03
Average       .00         .04            -.03      .05     -.05

In Table IX it appears that advancement in scholastic achievement
exerts a mild influence on scores in NIT. Surveying the several
grades, it appears that rate of reading has the greatest and most consistent
effect, yielding an average correlation of 0.23. The PE of this
coefficient is approximately .07, or less than one-third of the coefficient; hence the
chances are 24 to 1 that the correlation is not due to the inadequacy of
the data.1


1 For any average correlation of .14 or more, or for any single coefficient of .22 or 
more, the chances are 4 (or more) to 1 that it is significant of a genuine association. 
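
The step from a coefficient and its probable error to a statement of odds can be sketched under conventional assumptions. The article does not give the table or formula behind its odds; the code assumes a normal sampling distribution, takes one probable error as 0.6745 standard deviations, and counts deviations in either direction. The function name is illustrative, and the result need not reproduce the quoted 24 to 1 exactly.

```python
import math

PE_PER_SD = 0.6745  # one probable error expressed in standard deviations


def odds_against_chance(r: float, pe: float) -> float:
    """Odds (to 1) that a coefficient this large is not a chance deviation.

    Assumes a normal sampling distribution and a two-sided criterion;
    both are assumptions, not statements taken from the article.
    """
    z = abs(r) / pe * PE_PER_SD
    p_chance = math.erfc(z / math.sqrt(2))  # two-sided tail probability
    return (1 - p_chance) / p_chance


# A coefficient of twice its PE, as in the footnote's ".14 or more" case
# when the PE is .07.
print(round(odds_against_chance(0.14, 0.07), 1))

# A coefficient of three times its PE.
print(round(odds_against_chance(0.21, 0.07), 1))
```

Under these assumptions a coefficient of twice its probable error gives odds between 4 and 5 to 1, in line with the footnote's "4 (or more) to 1."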










Study of the dependence of the correlations on the grade discloses 
several tendencies of interest. The influence of writing is appreciable 
in Grade III, scarcely so in IV, and tends to be negative in V and VI. 
This result is quite in accord with our knowledge of individual children 
who wrote with painful slowness at the beginning of the year when the 
first tests were given in Grade III. The same was true of reading rate, 
the influence of which is a factor in higher grades as well. An extreme 
case in Grade III, a child backward in both reading and writing, 
scored 35 on the first NIT, and a year later 138. Reading compre- 
hension shows a change quite the reverse of writing; the correlation, 
approximately zero in Grade III, rises to .38 in Grade VI. Spelling 
and arithmetic, except for the r of .22 for the latter in Grade VI, show 
very low correlations. 

While some of the average correlations of scholastic and NIT 
gains are too small to satisfy the statistical requirements of substantial 
reliability, it is nevertheless a notable fact that all are positive; reading 
rate and reading comprehension are reliably so. All told, they indi- 
cate a genuine but slender association which may most probably be 
interpreted to indicate one or all of several things: (1) They may be 
due to the influence of “scholastic information and skill progressively
accumulated in school” upon the NIT results; (2) they may indicate
the influence of zeal and determination to excel in tests of all kinds; 
(3) they may indicate the development of skill, facility, technique in 
the taking of group tests which may, in a measure, pervade all of the 
types here used. All of these influences have, we believe, some weight; 
of the three the first probably has the most; the second, the least influ- 
ence. It may then be said with some confidence that educational 
advancement does influence achievement on the NIT. 

Before taking up the correlations of gains on the Binet Test 
with gains in achievement, the range of gains on the former should 
be considered. Measured in terms of IQ they were distributed as 
follows: 


























IQ change        Number of cases
-17                     1
-16 to -14              0
-13 to -11              2
-10 to -8               1
-7 to -5                5
-4 to -2               13
-1 to +1               11
+2 to +4               12
+5 to +7               16
+8 to +10              10
+11 to +13              2
+14 to +16              2
+18                     1



















The average change, all changes treated without regard to sign,
is 5.35 points IQ, which is somewhat larger than those usually found.
The average of the changes with regard to sign (i.e., the sum of the
plus changes minus the sum of the minus changes, divided by n) is a
gain of 2.2 points IQ. The range of changes is
unusually large, thus giving more than usual significance to any
comparisons with them.
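
The two averages can be approximated from the grouped distribution of IQ changes given above. The sketch assumes that every case sits at the midpoint of its class interval, so the results only approximate the reported 5.35 and 2.2, which were presumably computed from the ungrouped changes. The frequencies as transcribed sum to 76, whereas the text elsewhere speaks of 75 retested pupils.

```python
# (midpoint of class interval, number of cases), from the distribution above.
# Using midpoints is an assumption made for this sketch.
iq_change_distribution = [
    (-17, 1), (-15, 0), (-12, 2), (-9, 1), (-6, 5), (-3, 13), (0, 11),
    (3, 12), (6, 16), (9, 10), (12, 2), (15, 2), (18, 1),
]

n_cases = sum(count for _, count in iq_change_distribution)

# Average change without regard to sign.
mean_absolute_change = (
    sum(abs(mid) * count for mid, count in iq_change_distribution) / n_cases
)

# Average change with regard to sign (plus changes minus minus changes, over n).
mean_signed_change = (
    sum(mid * count for mid, count in iq_change_distribution) / n_cases
)

print(n_cases)
print(round(mean_absolute_change, 2))
print(round(mean_signed_change, 2))
```

The grouped estimates come out near 5.4 and 2.3 points, close to the figures stated in the text.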

Table IX gives the correlations of gains both in MA and IQ with 
other gains. It will be observed that the correlations between IQ and 
MA gains are very high—.90 on the average. Both show, conse- 
quently, quite similar associations with other gains. 

The general impression of these tables is unmistakable: “no
correlation.” Few of the single coefficients are twice the Probable
Error; negative values are nearly as frequent as positive values.
There is no evidence that educational advancement during a school
year in any line here represented has any influence upon the MA or
IQ earned at the end of the year. These data bear witness to the correctness
of Terman’s statement that “There is no reason to believe that
ordinary differences, such as those obtained among unselected children
attending the same general type of school in a civilized community,
affect to any great extent the validity of the scale.”

Let it be freely admitted that our data afford no conclusive test 
of Burt’s contentions, with which they surely fail to accord. A year 
is a short interval, the educational treatment of our cases was not as 
extremely differential as may be imagined, measures of differences are 
very unreliable since they partake of the errors of both measures from 
which they were obtained, and many factors may all but swamp the 
influence of education on the IQ. But giving due weight to all of 
these inadequacies and sources of error, had the effect of school 
progress on IQ been even moderate, some tendency to positive 
correlation should have appeared. 

If differences in educational progress do not account for the changes 
in IQ, what factors do? Irregularities, spurts, and retardation in 
mental growth are possible but improbable explanations. More 
likely causes of variations are errors in measurement, and of these 
there are three types: those due (1) to defects in the measuring instrument;1
(2) to mistakes and misjudgments of the examiner; and (3)
to variability in human performance. As in measuring other


1 For example, coarseness of the steps on the scale; see Cobb, M. V.: One
Element in the Probable Error of a Mental Age Measurement. Journal of Educational
Psychology, April, 1922.







human abilities, results are approximations, more or less close, to real 
ability. Even in measuring speed of running 100 yards, sources of 
error, the same in type if not in magnitude, are encountered. Stopwatches
are not always perfect; timers are less so, and no athlete can
run with exactly the same speed on each of several occasions. Both for
speed of running and mental ability, approximate measures are 
extremely useful. 

The results of the study of the relations of educational gains and 
changes in scores in mental tests disclose the peculiar value of the 
Stanford-Binet Test. It appears to operate in relative independence 
of the ordinary variations in school attainments. Based on content 
not taught in schools, the Binet Test is more widely applicable; it 
enables us to make comparisons in disregard of any save extreme 
variations in educational history, formal and informal. The MA and
IQ are universal currencies which, though far from perfectly valid, are
nevertheless surprisingly useful, especially for predicting general
scholastic achievements over relatively long intervals.

The group intelligence tests, of the NIT type, when utilized not 
only for immediate classification, but for purposes of future prediction, 
should be employed advisedly following careful scrutiny of the conditions
which obtained among the pupils to be tested. As Whipple
has pointed out,1 comparisons of results with general norms are usually
less significant than comparison with norms established within the test
situation itself. To compute IQ’s and MA’s, and to disregard previous
test experiences of the pupils, is to incur a danger of serious error.2
Properly used the NIT, and others similar, are extremely useful 
instruments for measuring general scholastic capacity—indeed, any 
test of scholastic achievement has a substantial value for appraising 
general ability. 

1 Whipple, G. M.: The National Intelligence Tests. Journal of Educational
Research, June, 1921.

2 The enormous errors to which careless practice will lead are indicated in Gates,
A. I.: The Unreliability of MA and IQ Based on Group Tests of General Mental
Ability. Journal of Applied Psychology, March, 1923.

The Nature of Factors Which Cause Correlation among School
Functions.—The last question, more theoretical than the others but not
without practical implications, concerns the nature of the cause of
correlations among the tests, particularly the tests of achievement.
The correlations established between the several school functions
(see Table III) may be attributed conceivably to common psychological
factors. Thus the high correlation between reading rate and reading
comprehension, or between reading rate and arithmetic, may be
most simply explained by assuming that each shares identical abilities
with the other. But should the term be singular or plural? Have we
in each case a common factor, one and the same, shared in different
amounts by different pairs of subjects? Or are we to recognize a multiplicity
of abilities, some or all of which are found in varying amounts
in different subjects?

These questions raise an old dispute in which Spearman is recog- 
nized as the leading defender of the single common factor belief and 
Thorndike as the exponent of the multiplicity of factors doctrine. 
The dispute has not been settled partly because of insufficient crucial 
data and, perhaps mainly, because there is not complete accord as to 
just what constitutes ‘‘sufficient proof’’ for either theory. Various 
criteria have been proposed; few accept any one as satisfactory. The 
most that may be done is to appraise the present data by several 
of the criteria which stand in reasonably high repute. 

As a first step, evidence may be offered in opposition to the notion 
that our intelligence tests measure a common mental capacity or general 
intelligence conceived as a unitary factor, completely and exclusively, 
and that the various linguistic and abstract school subjects depend 
mainly or entirely upon this same general capacity. If this were true 
in the extreme, arithmetic or any other subject, would not predict 
itself, as it demonstrably does, better than would a test of intelligence 
or a test of some other subject. Furthermore, the two intelligence 
tests are not correlated with each other perfectly, but to the extent, 
r = 0.86, which is the average of four correlations each corrected for 
unreliability of the measures. When age, which contributes to this
correlation, is eliminated by partial correlation,1 the r becomes 0.711.
Finally, the two tests give not only correlations of different magnitudes
with scholastic tests, on the average, but they give different relative
correlations with the several subjects as shown, for example, in
Table III.

1 By means of the usual formula given by Yule:
$r_{12.3} = \dfrac{r_{12} - r_{13}\, r_{23}}{\sqrt{1 - r_{13}^{2}}\,\sqrt{1 - r_{23}^{2}}}$

These considerations have a bearing on certain practices, particularly
the Accomplishment Ratio procedure, which sets up an intelligence
rating as the criterion of achievement. It is apparent that, at
present, this procedure is valid only for rough appraisals; and more
valid for evaluating general educational achievement by the consolidation
of several subjects than for gauging the attainment in a particular
subject.

These considerations, inasmuch as they are confined to and limited
by the validity of the particular tests used, bear but slightly, if at all,
on the general problem of the nature of the factors which cause correlations
among mental abilities. The theoretical common intellectual
factors may be very imperfectly represented by any of the intelligence
tests now in use.

Spearman has suggested, as a test of his theory, the closeness with
which a series of intercorrelations of mental abilities approximates a
hierarchy which, he assumes, would be obtained if his theory were
correct. The theory demands that if the correlations of a number of
mental functions are arranged in a descending order from left to right
and from top to bottom, as is usually done in a table of intercorrelations,
in every row and in every column, the correlations should be in the
same descending order. This means that in the table of intercorrelations,
each column will show a perfect correlation with every other
column.

To construct such a table, the intercorrelations between each pair 
of educational tests were averaged. Each mean coefficient, which 
is the average of 36 separate correlations, has been corrected for atten- 
uation. They are arranged in Table X. 














TABLE X.—SHOWING THE MEAN INTERCORRELATIONS (EACH COEFFICIENT THE
AVERAGE OF 36 r’s) CORRECTED FOR ATTENUATION AND ARRANGED TO SHOW
THE PRESENCE OR ABSENCE OF A HIERARCHY. THE NUMBERS IN
BRACKETS SHOW THE RANKS OF r’s IN THE COLUMNS

                        1,               2,            3,          4,
                        Comprehension    Arithmetic    Rate        Spelling
1 Comprehension          ....            .877(1)       .863(1)     .815(2)
2 Arithmetic             .877(1)         ....          .775(3)     .778(3)
3 Rate                   .863(2)         .775(3)       ....        .818(1)
4 Spelling               .815(3)         .778(2)       .818(2)     ....

The correlations among the columns are not perfect; indeed, they
give the appearance of a chance arrangement. The ranks, reading
from the left to the right columns, are: 1, 2, 3; 1, 3, 2; 1, 3, 2; 2, 3, 1.
The suggestion, then, is that the correlations of these tests are not due
to a factor, everywhere one and the same, but to the presence of many
factors which variously combined make up different functions.
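
The rank comparison applied to Table X can be reproduced with a short sketch. The coefficients are those of Table X; the procedure (rank the off-diagonal entries within each column, reading from top to bottom, and compare the rank orders across columns) is a plain reading of the text, and the function names are illustrative.

```python
LABELS = ["comprehension", "arithmetic", "rate", "spelling"]

# Mean intercorrelations from Table X (symmetric; each pair stored once).
TABLE_X = {
    ("comprehension", "arithmetic"): 0.877,
    ("comprehension", "rate"): 0.863,
    ("comprehension", "spelling"): 0.815,
    ("arithmetic", "rate"): 0.775,
    ("arithmetic", "spelling"): 0.778,
    ("rate", "spelling"): 0.818,
}


def corr(a: str, b: str) -> float:
    """Look up a coefficient regardless of the order of the pair."""
    return TABLE_X[(a, b)] if (a, b) in TABLE_X else TABLE_X[(b, a)]


def column_ranks(column: str) -> list:
    """Ranks (1 = highest) of a column's entries, read from top to bottom."""
    values = [corr(row, column) for row in LABELS if row != column]
    ordered = sorted(values, reverse=True)
    return [ordered.index(v) + 1 for v in values]


ranks = {column: column_ranks(column) for column in LABELS}
for column, rank_list in ranks.items():
    print(f"{column:14s} {rank_list}")

# In the simple sense used in the text, a hierarchy would show every
# column in the same descending order from top to bottom.
all_descending = all(r == [1, 2, 3] for r in ranks.values())
print(all_descending)
```

The printed ranks agree with those stated in the text, and only the first column is in strictly descending order.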

Spearman’s view is that mental functions embrace two kinds of
factors: a general factor, shared by all in different amounts and thus
causing correlation, and a specific factor or factors rarely shared by
two or more different functions. Thus, it should follow that by eliminating
from the correlation between 1 and 2 the association due to a
third function, which gives about equal correlations with them, the
result would be zero, or approximately zero, correlation. By means of
partial correlation it is possible to make such eliminations, which we
have done with results as shown in Table XI.


TABLE XI

Partial Correlations First Order
(coefficients illegible in the source)

Comprehension with arithmetic (rate eliminated)
Comprehension with arithmetic (spelling eliminated)
Comprehension with rate (arithmetic eliminated)
Comprehension with rate (spelling eliminated)
Comprehension with spelling (arithmetic eliminated)
Comprehension with spelling (rate eliminated)
Arithmetic with rate (comprehension eliminated)
Arithmetic with rate (spelling eliminated)
Arithmetic with spelling (comprehension eliminated)
Arithmetic with spelling (rate eliminated)
Rate with spelling (comprehension eliminated)
Rate with spelling (arithmetic eliminated)

Partial Correlations Second Order

Comprehension with arithmetic (rate and spelling eliminated)      0.63
Comprehension with rate (arithmetic and spelling eliminated)      0.51
Comprehension with spelling (arithmetic and rate eliminated)      0.23
Arithmetic with rate (comprehension and spelling eliminated)     -0.05
Arithmetic with spelling (comprehension and rate eliminated)      0.23
Rate with spelling (comprehension and arithmetic eliminated)      0.41
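
The second-order partial correlations can be illustrated by applying Yule's first-order formula recursively. This is a sketch only: it takes the attenuation-corrected coefficients of Table X as input, whereas the article does not say which coefficients underlie Table XI, so the output should be expected to show the same general pattern rather than the identical figures. Function names are illustrative.

```python
import math

# Mean intercorrelations from Table X (C = comprehension, A = arithmetic,
# R = rate, S = spelling). Using them here is an assumption of this sketch.
R0 = {
    frozenset("CA"): 0.877, frozenset("CR"): 0.863, frozenset("CS"): 0.815,
    frozenset("AR"): 0.775, frozenset("AS"): 0.778, frozenset("RS"): 0.818,
}


def r(a: str, b: str) -> float:
    """Zero-order correlation between two tests."""
    return R0[frozenset(a + b)]


def partial(r12: float, r13: float, r23: float) -> float:
    """First-order partial correlation r12.3 (Yule's formula)."""
    return (r12 - r13 * r23) / math.sqrt((1 - r13 ** 2) * (1 - r23 ** 2))


def second_order(a: str, b: str, c: str, d: str) -> float:
    """r_ab.cd: correlation of a with b, with c and d both eliminated."""
    r_ab_c = partial(r(a, b), r(a, c), r(b, c))
    r_ad_c = partial(r(a, d), r(a, c), r(d, c))
    r_bd_c = partial(r(b, d), r(b, c), r(d, c))
    return partial(r_ab_c, r_ad_c, r_bd_c)


comprehension_arithmetic = second_order("C", "A", "R", "S")
arithmetic_rate = second_order("A", "R", "C", "S")
rate_spelling = second_order("R", "S", "C", "A")

print(round(comprehension_arithmetic, 2))
print(round(arithmetic_rate, 2))
print(round(rate_spelling, 2))
```

On these inputs comprehension with arithmetic remains substantial, arithmetic with rate falls to about zero, and rate with spelling stays moderate, which is the pattern Table XI reports.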


There are common elements among these functions, as the decrease
in the partial coefficients shows, but the relations are such as suggest not
the presence of a single common element but many elements variously
shared by different functions. Thus comprehension and arithmetic
show a substantial correlation when the factors common to both rate
and spelling have been removed; rate of reading and spelling are
correlated (r = 0.41) by factors independent of comprehension and
arithmetic. There is apparently no single common factor running
through all of these tests which accounts in a complete way for the
relations among them. Apparently we must conceive mental ability
as a multitude of specific abilities of which a large number are active
during each mental act, some common to many, some to few. General
intelligence is the sum of such a multitude; as actually measured by
tests it is the sum of a sampling of the whole. These are the implications
of the data, which give no conclusive result. Although the coefficients
are highly reliable, the functions tested are too few and the relations
to factors, educational and other, here uncontrolled, are too complex
to make possible more than a suggestion of the facts.
















A STUDY OF THE RELATION BETWEEN ABILITY
TO LEARN AND INTELLIGENCE AS MEASURED
BY TESTS


O. J. JOHNSON 
Division of Research, Public Schools, St. Paul, Minnesota 


One of the fields of investigation as yet scarcely touched is 
the relation between intelligence and efficiency of learning. If group 
intelligence examinations measure the kind of traits required in learn- 
ing, it should be determined under experimental conditions how 
important the abilities in question are for various types of material 
and under different conditions. Such knowledge would be of great 
help to educational psychology in supplementing data already at hand 
on the relationship between mental ability and scholarship as measured 
either by marks or scores on achievement tests. It would also help to 
clear up points in dispute which retard the advance of intelligence 
testing and would direct such activities into more fruitful channels. 

The experiment reported below is an attempt in the indicated 
direction. It deals with the acquisition of new habits in the case of 
material which was already familiar; i.e., in learning to read inverted
print in a mirror. The subjects were 60 university students in the
writer’s class in educational psychology; 12 were men and 48 were
women. The experiment, carried on as part of the course, was
conducted as follows: 

All practice was done out of school hours. The student was 
directed to place a good mirror directly in front of himself so that it 
faced him. The book used was Starch’s “Educational Psychology,”
and this stood right side up on the table so that it faced the mirror.
The student’s task was to read the page by looking at its inverted
image in the mirror. A day’s practice consisted of 10 minutes’ work
and keeping careful record of the number of words read. The experi- 
ment continued during 20 days of practice. 

At first the reading was found to be very confusing. It was neces- 
sary to make a number of adjustments which conflicted with long- 
established habits. Not only was the print inverted, but lines had to
be read from right to left. It was also very hard to distinguish between
such letters as “p” and “q,” “d” and “b,” and “a” and “ss,” not so
much on account of the difficulty of seeing them as of the mental 
confusion momentarily occurring. 

During the course of the experiment a number of group
intelligence tests were given to the class. They were Army Examination
Alpha, Form 8; Thurstone Psychological Examination, Test
IV; Haggerty Reading Examination, Sigma 3; Van Wagenen Association
A Tests;1 and two unpublished forms of Geometrical Figures
Tests devised by the writer. The average score on all of these tests
formed the mental ability standing of each student.

At the conclusion of the 20 days of practice, correlations were 
figured between the average scores on all the tests and performance 
in mirror reading. This was done twice, first with the average number 
of words per day, and second with the improvement in ability to read. 
For this, the difference between the average number of words read 
during the first three days and last three days of practice was used as 
the measure. The correlations were as follows: 


With average number of words read per day . . . . .34 ± .08
With improvement in ability to read . . . . . . . .46 ± .07


These correlations were calculated according to the Pearson 
Product-Moments Formula. They show that there exists a fairly 
large positive relation between the ability to become efficient at learn- 
ing to read inverted print and intelligence as it is measured by the 
usual group tests. It is interesting to note that it is not the absolute 
amount which a person reads that is most important in this connection, 
but rather the amount of improvement. In other words, it is rapidity 
of learning—acquiring new connections—that is most closely related 
to mental ability. 
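The computation described above can be sketched as follows. This is a minimal illustration, not the writer's own worksheet: the function names are hypothetical, and the probable-error formula, PE = 0.6745(1 − r²)/√n, is the conventional one and is assumed here because the text does not quote it. With n = 60 it gives the figures .08 and .07 that are attached to the two coefficients.

```python
import math

def pearson_r(xs, ys):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def probable_error_of_r(r, n):
    """Conventional probable error of r (assumed formula, not quoted in the text)."""
    return 0.6745 * (1 - r ** 2) / math.sqrt(n)

def improvement(words_per_day):
    """Mean of the last three days minus mean of the first three days."""
    return sum(words_per_day[-3:]) / 3 - sum(words_per_day[:3]) / 3

print(round(probable_error_of_r(0.34, 60), 2))  # 0.08
print(round(probable_error_of_r(0.46, 60), 2))  # 0.07
```

In use, `pearson_r` would be applied once to the test averages paired with each student's mean words per day, and once to the test averages paired with each student's `improvement` value.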

In order to attack the problem in a somewhat different way, the 
curve of improvement was drawn for all 60 students as a group. In 
Fig. 1 this curve is represented by the heavy line in the middle. A 
glance at this curve indicates that the average increase in ability to 
read was regular from day to day; furthermore that the students had 
not reached their limit of improvement when the experiment ended. 

The data for the other curves in Fig. 1 were secured as follows: 

Figure I shows the performance in mirror reading for different 
groups of a university class arranged according to their mental 
abilities as indicated by test results. 

Curve 1 is for the 30 students who scored above the class average 
in the tests. 

Curve 2 is for the 30 students scoring below the class average.

Curve 3 is for the 15 highest students. 

Curve 4 is for the 15 lowest students. 





1 Van Wagenen, M. J.: Graded Opposites and Analogies Tests. Journal of
Educational Psychology, Vol. XI, May-June, 1920. 































The heavy central curve is for the total group of 60 students 
composing the class. 

The students were divided into two groups of 30 persons each. 
One group was composed of the students standing above the class 
average in the intelligence tests. Their curve is labeled with the 
figure 1. Similarly curve 2 shows the improvement of the students 
who fell below average in the tests. 

[Fig. 1. Curves of improvement in mirror reading. Vertical axis: average number of words read per day.]

In studying these curves, one would say that the relation between
mental ability and efficiency in learning to read is fairly close, because
it is evident that the students who stood above the average in intelligence
started with greater initial performance in mirror reading and
that they progressed considerably faster. While this is true when
students are considered as groups, individual progress records show 
that there are large variations among those who made very nearly the 
same scores on the mental tests. Further treatment of the data will 
bring out these facts more clearly and confirm the results already 
secured by the method of correlation. 

Selection was next made of the students who ranked in the upper 
and lower quartiles in the intelligence tests. It was thought that the 
difference between the rates of learning to read would be greater in the 
case of these groups than between the upper and lower halves. How- 
ever, curves 3 and 4 in Fig. 1 do not bear out this assumption. The 
striking thing is the lack of difference between the reading abilities of
the groups composing the upper one-fourth and upper half of the class,
or of the lowest fourth and lower half. In short, it appears that rather 
wide differences in mental ability do count in work of this type, but 
that small differences are overbalanced by other factors. 

It was thought worth while to check the results just mentioned in 
yet another way which if possible would show still more clearly the 
relation with intelligence as distinguished from other factors. In this 
case, the scores of students were arranged in pairs. That is, 
two students were selected who made identical, or nearly identical, 
scores on the intelligence tests. In this way, it was possible to pair all
students, making two groups of 30 students each which may, for
practical purposes, be considered to be equal in intelligence. The
curves of improvement for these groups are shown in Fig. 2.

[Fig. 2. Curves of improvement for the two paired groups. Vertical axis: average number of words read per day.]

Figure II shows the performances in mirror reading of two groups 
of thirty students each. These students were paired on the basis of 
equality in intelligence as measured by the average of several tests. 
This made two groups of equal mental abilities. 
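One way such pairing might be carried out is sketched below. The text does not say how the members of each pair were assigned to the two groups, so the alternating rule used here is purely an assumption, as are the function name and the sample scores.

```python
def paired_groups(scores):
    """Split students into two groups matched on test score.

    scores: dict mapping student name -> average intelligence-test score.
    Students are ranked, adjacent students form a pair, and the higher
    scorer of successive pairs goes alternately to group A and group B
    (an assumed rule; the article does not describe the assignment).
    """
    ranked = sorted(scores, key=scores.get, reverse=True)
    group_a, group_b = [], []
    for i in range(0, len(ranked) - 1, 2):
        higher, lower = ranked[i], ranked[i + 1]
        if (i // 2) % 2 == 0:
            group_a.append(higher)
            group_b.append(lower)
        else:
            group_a.append(lower)
            group_b.append(higher)
    return group_a, group_b

# Hypothetical scores for four students.
a, b = paired_groups({"a": 90, "b": 88, "c": 70, "d": 69})
print(a, b)
```

Alternating the assignment keeps the two group averages close even when the members of a pair differ by a point or two.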

If the ability to read print in a mirror is to some extent due to intel- 
ligence, we should expect to find less divergence in performance 
between the mentally equal groups of Fig. 2 than between the groups 
in Fig. 1. That this is true in general is clearly shown by inspection of 
the graphs. Mental ability undoubtedly enters to a considerable 
extent. If the correlation were perfect, the two groups in Fig. 2 should
make identical records throughout; but this is not the case. One
group is uniformly superior throughout. This, one is forced to con- 
clude, must be due to other traits than those measured by the intelli- 
gence tests. The results are however remarkable in showing such a 
close correlation, when one considers the rather mechanical and unin- 
teresting nature of the task of learning to read inverted print of a 
difficult thought content. 


Conclusions 


1. We need carefully conducted experiments to show the relation 
between different types of learning and intelligence. 

2. The results of this study suggest that in any experiment on learn- 
ing account should be taken of the intelligence of subjects before results 
are worked out or conclusions are drawn. 

3. It suggests one reason for the lack of agreement between results 
of investigations similar in nature, but where the subjects were
different. 

4. It raises a question as to the validity of much experimental work 
done in the past in which no attention was paid to the mental abilities 
of the subjects. 





SCALES FOR MEASURING JUDGMENT OF 
ORCHESTRAL MUSIC 


M. R. TRABUE 


Director of the Bureau of Educational Research, University of North Carolina 


Insofar as schools undertake to change the musical tastes of their 
pupils, it is desirable that scales be available for measuring the extent 
of the changes that occur during any period of training. Seashore 
has devised tests which measure “Musical Talent” in a number of its
elementary phases, and other investigators have attempted to devise 
measures for musical intelligence and for ability to recognize moods in 
music, but as yet no one has reported a test that measures ability to 
distinguish between good music and poor music. Although it is 
important that the school should discover pupils whose talents make it 
possible for them to become producers of music, it is equally important 
that it should measure the extent to which its efforts to improve the 
tastes of consumers of music are successful. It was in an effort to 
supply such measuring instruments for taste or judgment in orchestral 
music that Mr. M. L. Mohler began the present study.¹

Nature of the Mohler Tests—Mr. Mohler’s tests for measuring 
ability to judge orchestral music involve the use of phonographic 
records of 16 different musical compositions. The relative merits of 
these records have been determined and have been assigned numerical 
indexes by combining the judgments of expert musicians with the 
judgments of other intelligent adults in a manner to be described 
later. Records regarding whose merit the experts and the intelligent 
laymen were not in substantial accord have been omitted from the 
scales here proposed for general use. The tests as presented assume, 
therefore, that one record is better than another if it was considered 
better by both the musical experts and the other intelligent persons 
whose judgments are reported in the following pages. 


1 Mr. Mohler planned the investigation and completed the field work in 1920 
under the joint direction of Professors Thomas H. Briggs and Truman L. Kelley. 
Unfortunate circumstances prevented the completion and publication of the study. 
The results, however, seemed to the present writer so valuable that he asked and 
received from Mr. Mohler permission to prepare the report for publication. 
The reader will therefore find in the report certain omissions and imperfections 
which are due to the fact that the work was done by different persons at its various 
stages. 



The complete list of phonograph records used in this study and the
numbers by which they will be designated in this report are
given below:

Record number¹ | Name of selection                          | Composer
 1             | Valse, from Ballet Music of “Faust”        | Gounod
 2             | Hunt in the Black Forest                   | Voelker
 3             | Wedding of the Rose                        | Jessel
 4             | Unfinished Symphony, First Movement        | Schubert
 5             | [illegible]                                | Walter E. Miles
 6             | [illegible]                                | Nevin
 7             | [illegible]                                | Skilton
 8             | Turkish March, from “Sonata in A Major”    | Mozart
 9             | How Beautiful Art Thou                     | Bonincontis
10             | [illegible]                                | Wm. Penn
11             | [illegible]                                | [illegible]
12             | Largo, from “New World Symphony”           | Dvorak
13             | [illegible]                                | Loraine
14             | Sounds from the Music Room                 | Smith
15             | Triumphal March, from “Aida”               | Verdi
16             | Introduction, Act III, “Lohengrin”         | Wagner
17             | [illegible]                                | Ivanov
18             | [illegible]                                | Durand
19             | Anitra’s Dance, from “Peer Gynt Suite”     | Grieg
20             | Nightingale and Frogs                      | Eilenberg
21             | [illegible]                                | [illegible]
22             | [illegible]                                | [illegible]





1 These numbers are for convenient reference only. They have no relation 
whatever to the quality of the music. 


The tests are arranged and administered in such a manner that a 
person who can detect small differences in the general merits of two 
selections will receive a high score, while one who can detect only the 
larger differences will receive a lower score. Three or four records 
are played in a group, one record immediately after the other. Each 
listener is asked, when a group of records has been played, to make a 
note indicating which record he considered “best,” which “next best,”
and which “poorest.” The differences in musical values of the records
in the first group in each test are relatively large, while the differences 
in each succeeding group are smaller and smaller. The number of 
points allowed as a score for successfully rating each group of records
is dependent upon the size of the errors in one’s judgment, no credit
at all being given if the person has rated the records in exactly the
reverse order from that which is considered correct, and maximum
credit being given when the correct order has been accurately indicated.
A low score therefore indicates little or no success in judging
the relative merits of these orchestral selections, while a high score
indicates a greater degree of success.

[Fig. 1. Scale Alpha and Scale Beta: Groups A to D plotted against relative musical values.]
[Fig. 2. Scale Gamma and Scale Delta: Groups A to E plotted against relative musical values.]

The order in which the records are to be played in giving a test 
and the calculated musical values of each record are given in Tables I 
and II and in Figures 1 and 2. Figure 1 and Table I give the arrange- 
ment of Scales Alpha and Beta, while Figure 2 and Table II give the 
arrangement of Scales Gamma and Delta. Scale Alpha and Scale
Beta each consist of four groups, each group containing four records,
while Scale Gamma and Scale Delta each consist of five groups, each 
group containing three records. In the diagrams, distance to the right 
indicates increasingly high quality, while the distance from the top 
indicates the order in which the records are to be played. In Group 
A of Scale Alpha, for example, the second record is the best and the 
third record played is the poorest. For each group in Scale Alpha 
there is a corresponding group in Scale Beta, composed of different 
records but having the same general arrangement and intervals of 
quality. Similarly, for each group in Scale Gamma there is a 
corresponding group of records in Scale Delta. 


TABLE I.—RELATIVE VALUES AND INTERVALS BETWEEN VALUES IN SCALES ALPHA AND BETA

Group  | Record | Relative | Interval || Group  | Record | Relative | Interval
number | number | quality  |    Δ     || number | number | quality  |    Δ
A1     |   19   |  + .28   |  + .92   || A1     |   15   |  - .35   |  + .90
A2     |    4   |  +1.20   |  -2.58   || A2     |   18   |  + .55   |  -2.53
A3     |   13   |  -1.38   |  + .90   || A3     |    2   |  -1.98   |  + .81
A4     |    8   |  - .48   |  + .76   || A4     |    9   |  -1.17   |  + .82
B1     |    2   |  -1.98   |  + .81   || B1     |    5   |  - .90   |  + .42
B2     |    9   |  -1.17   |  +1.45   || B2     |    8   |  - .48   |  +1.68
B3     |    1   |  + .28   |  - .83   || B3     |    4   |  +1.20   |  - .92
B4     |    3   |  - .55   |  -1.43   || B4     |   19   |  + .28   |  -1.18
C1     |   10   |  - .79   |  - .59   || C1     |    3   |  - .55   |  - .48
C2     |   13   |  -1.38   |  +1.38   || C2     |   17   |  -1.03   |  +1.31
C3     |   12   |    0     |  - .35   || C3     |    1   |  + .28   |  - .28
C4     |   15   |  - .35   |  - .44   || C4     |   12   |    0     |  - .55
D1     |    3   |  - .55   |  - .24   || D1     |   15   |  - .35   |  - .26
D2     |   10   |  - .79   |  - .24   || D2     |   16   |  - .61   |  - .18
D3     |   17   |  -1.03   |  - .32   || D3     |   10   |  - .79   |  - .38
D4     |   20   |  -1.35   |  + .80   || D4     |    9   |  -1.17   |  + .82































TABLE II.—RELATIVE VALUES AND INTERVALS BETWEEN VALUES IN SCALES
GAMMA AND DELTA

Group  | Record | Relative | Interval || Group  | Record | Relative | Interval
number | number | quality  |    Δ     || number | number | quality  |    Δ
A1     |    4   |  +1.20   |  -2.55   || A1     |   18   |  + .55   |  -2.53
A2     |   20   |  -1.35   |  +1.35   || A2     |    2   |  -1.98   |  +1.08
A3     |   12   |    0     |  +1.20   || A3     |    5   |  - .90   |  +1.45
B1     |   18   |  + .55   |  - .90   || B1     |    4   |  +1.20   |  - .92
B2     |   15   |  - .35   |  - .82   || B2     |   19   |  + .28   |  - .76
B3     |    9   |  -1.17   |  +1.72   || B3     |    8   |  - .48   |  +1.68
C1     |    2   |  -1.98   |  + .60   || C1     |    9   |  -1.17   |  + .56
C2     |   13   |  -1.38   |  + .59   || C2     |   16   |  - .61   |  + .61
C3     |   10   |  - .79   |  -1.19   || C3     |   12   |    0     |  -1.17
D1     |    5   |  - .90   |  + .90   || D1     |   20   |  -1.35   |  + .80
D2     |   12   |    0     |  - .48   || D2     |    3   |  - .55   |  - .48
D3     |    8   |  - .48   |  - .42   || D3     |   17   |  -1.03   |  - .32
E1     |   15   |  - .35   |  - .26   || E1     |   10   |  - .79   |  - .38
E2     |   16   |  - .61   |  - .42   || E2     |    9   |  -1.17   |  - .21
E3     |   17   |  -1.03   |  + .68   || E3     |   13   |  -1.38   |  + .59
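The interval column in these tables appears to give the step from each record's value to that of the record played next, the last entry in a group closing the cycle back to the first record. That reading is an inference from the printed figures and is not stated in the text; the short sketch below (function name hypothetical) checks it against two groups.

```python
def cyclic_intervals(values):
    """Difference from each record's value to the next, wrapping to the first."""
    n = len(values)
    return [round(values[(i + 1) % n] - values[i], 2) for i in range(n)]

# Group A in the left-hand half of Table II: records 4, 20, 12.
print(cyclic_intervals([1.20, -1.35, 0.0]))          # [-2.55, 1.35, 1.2]

# Group A in the left-hand half of Table I: records 19, 4, 13, 8.
print(cyclic_intervals([0.28, 1.20, -1.38, -0.48]))  # [0.92, -2.58, 0.9, 0.76]
```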





























Scores in the Tests.—Several schemes for obtaining a numerical 
score were tested, and the one finally selected and recommended for 
general use is neither the simplest nor the most accurate possible. 
It is, however, a compromise which involves both of these merits in a 
satisfactory degree. In any given group of records played, the only 
judgments that are allowed to count in the determination of one’s 
score are his ratings of the best record and of the poorest. This 
method fails to employ all of the information available, especially in 
Scales Alpha and Beta where each group has a “next best” and a
“next poorest” record. An actual trial of scoring methods based on
ratings of all the records in a group failed, however, to increase per- 
ceptibly the coefficient of correlation between two measurements of 
the same group of persons. The present scheme was therefore adopted 
as being reasonably accurate and yet simple enough for practical 
purposes. 

The method for obtaining a score can best be explained by illustrations.
If the relative values of three records in Scale Gamma are in
the order 1-2-3 from the poorest to best, and a listener rates them in
1-2-3 order, he is to be given 4 points credit for that group—2 points 
for having judged the best record correctly, and 2 points for having 
judged the poorest one correctly. If the record of intermediate value 
is judged “best,” however, while the best one is rated as “middle,”
2 points are to be given for having located the poorest correctly, and 1 
point for having made only a one place error in rating the best record. 
In short, 2 points of credit are to be allowed for placing a best record 
(or a poorest record) in its correct position with reference to the other 
two, 1 point for placing it in the position next to its rightful one, and 
no credit at all if it is placed at the wrong end of the group. The 
maximum credit for each of the five groups in Scale Gamma (or Scale 
Delta) is therefore 4 points. If the first record were poorest and the 
third one best in a group, a 1-2-3 rating would receive 4 points credit, 
a 1-3-2 rating 3 points; a 3-1-2 rating 1 point, and a 3-2-1 rating no 
credit at all. 
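The rule for groups of three can be written out as a small function. This is an illustrative sketch only; the function and argument names are not part of the scales. Both orders list the records, by position played, from poorest to best.

```python
def score_group_of_three(true_order, rated_order):
    """Credit for one three-record group in Scale Gamma or Scale Delta.

    Each argument lists the records from poorest to best, e.g. (1, 2, 3).
    Only the truly poorest and truly best records count: 2 points if a
    record is rated in its correct place, 1 point if it is one place off,
    and nothing if it is placed at the wrong end.
    """
    points = 0
    for true_slot in (0, 2):              # slot 0 = poorest, slot 2 = best
        record = true_order[true_slot]
        rated_slot = rated_order.index(record)
        points += 2 - abs(rated_slot - true_slot)
    return points

for rating in [(1, 2, 3), (1, 3, 2), (3, 1, 2), (3, 2, 1)]:
    print(rating, score_group_of_three((1, 2, 3), rating))
```

Run on the four ratings named in the text, this returns 4, 3, 1 and 0 points respectively.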

The method of scoring for Scales Alpha and Beta is similar to that 
for Gamma and Delta in that only the ratings of the best and of the 
poorest records are considered. Three points are to be allowed for
having rated the best record as “best,” two points for having rated it
“next best,” one point for having rated it “next to poorest,” and no
credit for having judged it “poorest”; the poorest record is credited
in the corresponding manner. The maximum credit for each of
the four groups in Scale Alpha (or Scale Beta) is therefore 6 points. If
the correct order in a group were 1-2-3-4, from poorest to best, then a
6 point credit would be granted for a rating of 1-3-2-4, a 5 point credit
for a 1-X-4-X¹ rating, a 4 point credit for a 1-4-X-X or an X-1-4-X
rating, a 2 point credit for an X-4-1-X, a 1 point credit for a 4-X-1-X
rating, and no credit for a 4-X-X-1 rating.
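The four-record rule may be sketched in the same way. The sketch assumes that the poorest record is credited on the same 3-2-1-0 plan as the best, which the text implies through the 6 point maximum but does not spell out; the function name is hypothetical. Under this reading an X-1-4-X rating earns 4 points and an X-4-1-X rating earns 2.

```python
def score_group_of_four(rated_order, poorest=1, best=4):
    """Credit for one four-record group in Scale Alpha or Scale Beta.

    rated_order lists the records from poorest to best as the listener
    judged them; "X" may stand for either intermediate record.
    Each of the two extreme records earns 3 points less one point for
    every place it is rated away from its true position.
    """
    points = 0
    for record, true_slot in ((poorest, 0), (best, 3)):
        rated_slot = rated_order.index(record)
        points += 3 - abs(rated_slot - true_slot)
    return points

examples = [
    [1, 3, 2, 4],
    [1, "X", 4, "X"],
    [1, 4, "X", "X"],
    ["X", 1, 4, "X"],
    ["X", 4, 1, "X"],
    [4, "X", 1, "X"],
    [4, "X", "X", 1],
]
for rating in examples:
    print(rating, score_group_of_four(rating))
```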

The maximum possible score in Scale Alpha or in Scale Beta is thus 
seen to be 24 points, while the maximum in Scale Gamma or in Scale 
Delta is 20 points. For practical purposes one may consider Scales 
Alpha and Beta as equivalent to each other. Scales Gamma and Delta 
make up another pair, differing from Alpha and Beta but comparable 
with each other. No table has yet been worked out for converting 
Alpha or Beta scores into equivalent Gamma or Delta scores, or vice 
versa. It is recommended that Gamma and Delta be used where 
possible, because of the greater ease with which people are able to judge 
the relative merits of three selections rather than four at once. 


1 The X in these series stands for either of the two records having values above
the poorest but below the best in the group. 







Sixty pupils in Grade VI of the public schools at Hackensack, 
New Jersey, were tested with a preliminary form of Scale Beta. 
Their scores ranged from 3 to 22, with a median value of 10.8 and 
a semi-interquartile range of 3.2 points. Two hundred and thirty 
pupils in high schools of New Jersey and New York City were 
tested with preliminary forms of Alpha and Beta. Their scores 
ranged from 4 to 23, with a median value at 12.9 and a semi-interquar- 
tile range of 3.4 points. These scores are not satisfactory standards 
for the present forms of Alpha and Beta, but they are given here as 
indications of approximately what may be expected from the use of 
these scales. 

Scales Gamma and Delta have been employed in their present form 
in testing a few students in North Carolina. The results obtained in 
the schools of Greensboro are given in Table III.


TABLE III.—SCORES IN JUDGMENT OF ORCHESTRAL MUSIC AT GREENSBORO, N. C.¹
Scales Gamma and Delta

Type of school   | Public elementary  | N. C. C. W. Training | Public High
School grade     |   5  |   6  |   7  |   5  |   6  |   7    |   I  |  II  |  III |  IV
Number of pupils |  63  |  87  | 216  |  29  |  31  |  24    | 103  |  57  |  41  |  15
Median score     |  8.1 |  9.6 |  9.7 |  9.3 | 10.6 | 12.0   | 11.4 | 11.9 | 11.9 | 12.1
[Q]              |  2.6 |  1.8 |  2.0 |  1.3 |  1.3 |  1.9   |  2.2 |  2.4 |  1.9 |  1.2











1 There are only seven grades in the elementary school system in North Carolina. 


It is interesting to note that the pupils in the Training School of the 
North Carolina College for Women had higher average scores than the 
pupils of the same grades in the public schools where less attempt had 
been made to develop musical appreciation. 

TABLE IV.—SCORES OF COLLEGE STUDENTS IN MUSICAL JUDGMENT
Scales Gamma and Delta

College          | N. C. College for Women          | University
Class            | Freshmen | Sophomores | Juniors | Sophomores
Number of pupils |    30    |     16     |   68    |     46
[Median score]   |   10.9   |    11.0    |  11.9   |    12.0

Table IV gives the scores of such college students as the writer has
thus far been able to measure with Scales Gamma and Delta. In
general these college students are not superior to the Greensboro High
School students, but this is probably because they have had no more 
training in music appreciation than is offered by the Greensboro 
High School. Elementary school pupils living near Philadelphia, 
New York, Boston, or other musical centers would undoubtedly do 
better on these tests than North Carolina college students, who as a 
rule have had very few musical experiences. 

Effect of Training.—The most significant finding of this study was 
that the characteristics measured by these tests are easily improved 
by training. With tests of a similar character in judging English 
poetry, it has been found that training has relatively little effect on suc- 
cess in the tests. In the case of orchestral music, however, there 
seems to be an unusually large opportunity for the improvement of 
taste through musical training and experiences. 

In Greensboro, N. C., the supervisor of music last year held a 
Music Memory Contest. One Grade VI class went into special train- 
ing for the event and won the prize. When tested this year with 
Scale Gamma, this Grade VII class made a median score of 14.1, with 
a Q of 2.0. This score is not only superior to the scores of other 
seventh grades in the city (9.7) but is higher than any group of high 
school or college students in the state have thus far made. Only two 
or three of the selections used in Scale Gamma were in the list of com- 
positions that had been studied in preparation for the Music Memory 
Contest. 

A controlled experiment was planned and conducted by Mr. Mohler 
to determine the effect of a short period of training on ability in judging 
music—ability being measured by the preliminary forms of Scales 
Alpha and Beta. Two groups of pupils in the same grade of the same 
school were tested by Scale Alpha (or Scale Beta) to make certain that 
they had approximately the same scores. One of the groups was then 
given a series of eight lessons in music appreciation, while the other 
group was given no special training in music. The lessons, which 
required 40 minutes each week for eight weeks, were carefully planned 
and administered, care being taken to avoid playing or discussing the 
selections used in the test. At the end of the training period, both 
the trained group and the untrained group were again measured by 
the scale previously used. Table V shows the median scores and
semi-interquartile ranges for both trained and untrained groups in three
different communities where the experiment was performed, and the
differences between these measures before and after training.


1 Abbott, Allan and Trabue, M. R.: A Measure of Ability to Judge Poetry.
Teachers College Record, Vol. XXII, No. 2, March, 1921.


TABLE V.—EFFECT OF TRAINING ON SCORES IN MUSICAL JUDGMENT

                          |       |  First test  | Second test  | Difference | Difference
School and group          | Scale | Median |  Q  | Median |  Q  | in median  |   in Q
Hackensack High School
  26 pupils, untrained    | Beta  |  12.3  | 2.4 |  13.0  | 3.5 |   + .7     |   +1.1
  45 pupils, trained      | Beta  |  10.1  | 1.7 |  20.1  | 1.5 |   +10.0    |   - .2
Hackensack Sixth Grade
  30 pupils, untrained    | Beta  |  10.7  | 3.0 |  11.0  | 3.2 |   + .3     |   + .2
  30 pupils, trained      | Beta  |  11.0  | 2.8 |  19.9  | 1.3 |   + 8.9    |   -1.5
Summitt High School
  21 pupils, untrained    | Beta  |  10.3  | 2.9 |  12.5  | 2.9 |   + 2.2    |     0
  17 pupils, trained      | Beta  |  10.5  | 2.1 |  18.8  | 2.1 |   + 8.3    |     0
  30 pupils, untrained    | Alpha |  16.3  | 3.3 |  16.0  | 2.8 |   - .3     |   - .5
  40 pupils, trained      | Alpha |  14.7  | 3.9 |  20.5  | 1.6 |   + 5.8    |   -2.3
Horace Mann High School
  14 pupils, untrained    | Alpha |  14.7  | 2.0 |  14.5  | 5.1 |   - .2     |   +3.1
  14 pupils, trained      | Alpha |  16.5  | 3.2 |  19.7  | 1.5 |   + 3.2    |   -1.7


























In almost every case, the group that received training increased its 
median score 10 or more times as much as the control group. It is 
significant also that the control groups tended to increase their disper- 
sion or variability while the trained groups decreased their measures of 
dispersion. The smallest amount of improvement in any group 
trained was at the Horace Mann High School in New York City 
where the pupils had a relatively high score when the experiment 
started. If one accepts as fairly valid the relative values used here for 
the records in these scales, it is clear that a well arranged course in 
listening to music can in a short time work a great improvement in 
the accuracy of pupils’ judgments of orchestral selections. 
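The comparison drawn from Table V can be recomputed directly from its medians. The sketch below is illustrative only: the data layout and function name are assumptions, and the figures are the first-test and second-test medians as printed in the table.

```python
# (school, scale, untrained medians (first, second), trained medians (first, second))
TABLE_V = [
    ("Hackensack High School", "Beta",  (12.3, 13.0), (10.1, 20.1)),
    ("Hackensack Sixth Grade", "Beta",  (10.7, 11.0), (11.0, 19.9)),
    ("Summitt High School",    "Beta",  (10.3, 12.5), (10.5, 18.8)),
    ("Summitt High School",    "Alpha", (16.3, 16.0), (14.7, 20.5)),
    ("Horace Mann High School", "Alpha", (14.7, 14.5), (16.5, 19.7)),
]

def median_gains(table):
    """Return (school, scale, untrained gain, trained gain) for each comparison."""
    rows = []
    for school, scale, untrained, trained in table:
        rows.append((school, scale,
                     round(untrained[1] - untrained[0], 1),
                     round(trained[1] - trained[0], 1)))
    return rows

for school, scale, control_gain, trained_gain in median_gains(TABLE_V):
    print(f"{school:24s} {scale:5s} untrained {control_gain:+5.1f}  trained {trained_gain:+5.1f}")
```

Listing the two gains side by side, rather than their ratio, avoids dividing by the very small or negative gains of some untrained groups.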

Derivation of the Scales.—The fundamental assumption that formed
the basis of the numerical values attached to each record was the 
theorem used by Hillegas in developing his English Composition Scale, 
by Murdoch in preparing her Scales for Hand Sewing, and by others in 
still other fields, that “Equally often noticed differences are equal,
unless never noticed or always noticed.” If 50 per cent of the judges
thought record A better than record B, while the other 50 per cent 
thought B better than A, one could fairly certainly assume that the 
two were approximately equal in merit. If 60 per cent thought A
better than B, and 60 per cent thought B better than C, then one 
could assume that A was just as much better than B as B was better 
than C. 

By accepting the useful assumptions that the judgments upon
each record will be distributed symmetrically about the true value
according to the normal surface of frequency, and that the dispersion 
of the judgments on one record will be equivalent to the dispersion on 
any other, it was possible, by the help of statistical tables, to convert 
any percentage of “‘better’’ judgments into a numerical statement of 
the amount of difference between the two records, the difference 
being stated in terms of some measure of dispersion or variability of 
judgments as a unit. For the scales in judging music, the PE, or 
median deviation from the median, was chosen as the unit by which 
to measure the differences in quality between the records. 

Of 368 intelligent persons who listened to record number 1 and 
record number 2 played in the same group, for example, 329 persons 
(89.4 per cent) judged that number 1 was better than number 2, while 
the other 39 decided that number 2 was better than number 1. When
89.4 per cent of a large group of judges decide that one record is better 
than another, one may assume that, for these particular judges at least, 
the one is distinctly better than the other. It is at this point that the 
table of values of the normal probability integral corresponding to 
values of X/PE becomes useful. Such a table shows that if 89.4 per 
cent of the judges think number 1 better than number 2, then number 1 
is for them 1.85 PE better than number 2. 

In a similar manner the other two records of this group of four 
were evaluated by means of the distribution of judgments on record 
number 1. Record number 3 was considered poorer than number 1 
by 60.6 per cent of the group and was therefore rated as .4 PE poorer, 
while record number 4 was judged poorer than number 1 by only 30.7 
per cent of the group, which indicated that it was .75 PE better than 
number 1. As measured by the distribution for record number 1, 
therefore, the poorest record in that group was number 2, which was 
1.85 PE below number 1. The next was number 3, which was .4 PE 
below number 1, and the best was number 4, which was .75 PE above 
number 1. 
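The table lookup described in the last two paragraphs can be imitated with the inverse of the normal distribution function. The sketch below uses Python's standard `statistics.NormalDist`; the function name is hypothetical, and dividing by the deviate at 75 per cent (about 0.6745 sigma, which is one PE) is the assumed equivalent of reading a table of X/PE.

```python
from statistics import NormalDist

_STANDARD_NORMAL = NormalDist()
# One probable error, in sigma units: half of all cases lie within +/- 1 PE.
PE_IN_SIGMA = _STANDARD_NORMAL.inv_cdf(0.75)

def pe_difference(share_judging_better):
    """Difference between two records, in PE units, from the share of judges
    who thought the first record better (0 < share < 1)."""
    return _STANDARD_NORMAL.inv_cdf(share_judging_better) / PE_IN_SIGMA

print(round(pe_difference(329 / 368), 2))  # record 1 over record 2
print(round(pe_difference(0.606), 2))      # record 1 over record 3
print(round(pe_difference(0.307), 2))      # record 1 over record 4 (negative: 4 is better)
```

The three values come out at about 1.85, .40 and -.75, agreeing with the figures quoted in the text; a share of exactly 50 per cent gives a difference of zero.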







The differences between the four records in each of the other groups 
were similarly determined and stated in PE units. Each set of four 
records played as one group contained one of the records played in 
the first or second group, which made it possible to relate the values 
of records in any group to the values of records in any other group. 
The final value assigned to each record shows its average relation to 
record number 12, which was chosen as a general reference point 
because of its location near the center of the values for the “really 
good selections.” 

In actual practice the calculation of values was more complicated 
than the above discussion would indicate. In the first place there 
were two sets of judges. The expert judges agreed more closely 
among themselves, which resulted in a shorter PE, which in turn caused
the differences between selections to have larger indexes when 
measured by the distribution of expert judgments than when measured 
by the distribution of ordinary judgments. This difference between 
the two groups of judges was met by taking as a final value the arith- 
metic mean or average of the two determinations. Record number 15, 
for example, was judged by the experts as .68 PE below the reference 
point (number 12) and by the other judges as .02 PE below this point. 
The final value assigned to number 15 was therefore —.35 PE from 
number 12.
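The averaging of the two determinations is simple arithmetic, shown here for record number 15 as a check. The function name is illustrative only.

```python
def final_value(expert_pe, lay_pe):
    """Arithmetic mean of the expert and non-expert determinations, in PE
    units measured from the reference record (number 12)."""
    return (expert_pe + lay_pe) / 2

print(round(final_value(-0.68, -0.02), 2))  # -0.35
```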

For both types of judges and within each group of four records 
played together, four distributions were available for measuring the 
differences between the four records, and these did not agree perfectly 
even in their indications of the order in which the records should be 
ranked. The distribution of non-expert judges for record number 17, 
for example, indicated that the correct order from poorest to best in 
that group was 17-3-16-15, while the distribution for record number
16 indicated a 17-16-15-3 order, and the other two distributions
each indicated 17-16-3-15 as the correct order. It is needless to add
that the amount by which No. 16 differed from No. 17 had at 
least three different solutions. 

In order to provide a uniform rule for action it was decided
to measure the difference between any two records in a group primarily 
by the evidence of the distribution for the two records concerned. 
Assuming, for example, that number 16 was next in quality above
number 17, the direct comparison showed that 64.4 per cent considered 
number 16 the better, and that one might therefore rate it as .55 
PE better, so far as these two distributions were concerned. But a
similar direct comparison showed number 15 to be .67 PE better than
number 17, and comparison of number 16 and number 15 showed num- 
ber 15 to be .10 PE better than number 16. Another useful measure 
of the difference between number 17 and number 16 was therefore 
indirectly available by subtracting .10 PE (the distance from number 
16 to number 15) from .67 PE (the distance from number 17 up to 
number 15), the result being .57 PE. In a similar manner, by subtracting
the difference between number 16 and number 3 (.24 PE) from that
between number 17 and number 3 (.49 PE), still another useful indirect 
measure (.25 PE) was found of the superiority of number 16 over 
number 17. Twice as much weight was arbitrarily given in each case 
to the direct measurement as was given to each of the other two 
determinations just described. The final measure of the interval from 
number 17 to number 16 was therefore .48 PE [(2 × .55 + .57 + .25)
÷ 4 = .48].
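
The computation just described can be retraced in a brief sketch. It assumes the usual normal-curve conversion of a percentage of preferences into PE units (one PE being about .6745 of a standard deviation), which reproduces the figures quoted in the text; the function names are illustrative assumptions, and the counts are those of Series E in Table VI.

```python
from statistics import NormalDist

# One probable error (PE) is the normal deviate that cuts off the upper
# 25 per cent of cases, about 0.6745 standard deviations.
PE_IN_SIGMA = NormalDist().inv_cdf(0.75)


def pe_difference(times_preferred: int, judges: int) -> float:
    """Convert the share of judges preferring a record into PE units."""
    proportion = times_preferred / judges
    return NormalDist().inv_cdf(proportion) / PE_IN_SIGMA


def weighted_interval(direct: float, indirect: list[float]) -> float:
    """Weight the direct measure twice as heavily as each indirect one."""
    return (2 * direct + sum(indirect)) / (2 + len(indirect))


# Series E, non-expert judges (306 in all), counts from Table VI.
direct_17_16 = round(pe_difference(197, 306), 2)  # 16 preferred to 17
up_to_15 = round(pe_difference(206, 306), 2)      # 15 preferred to 17
from_16_to_15 = round(pe_difference(161, 306), 2) # 15 preferred to 16
via_15 = up_to_15 - from_16_to_15
via_3 = 0.49 - 0.24                               # values quoted in the text
interval = weighted_interval(direct_17_16, [via_15, via_3])
print(direct_17_16, round(via_15, 2), round(interval, 2))
```

Rounding each determination to two places before combining follows the text, which states the intermediate values to hundredths of a PE.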

There were still other complicating factors in determining the 
values of the intervals of quality between records, but there is not 
space in this article to describe fully the treatment accorded to each 
one. Tables VI and VII give the original data concerning the judg- 
ments used. The non-expert judges were for the most part teachers 
and normal school students, some of whom had received musical 
training, although most of them were studying other branches of 
education. Classes at Teachers College, Columbia University, sup- 
plied some of these judges, while others were from the normal schools 
at Worcester, Hyannis, Framingham, Fitchburg, Bridgewater, Boston, 
and Providence. All of these judges were graduates of standard high 
schools and at least 16 years of age. Most of them were women. 
The expert group was composed of selected music supervisors, teachers, 
writers, and publishers attending the sessions of the Eastern Music 
Teachers Association at New York City, in May, 1920. It is unfor- 
tunate that the number of expert judges was not larger. 

Table VIII and Figure 3 show the final values of the records as
calculated from the data given in Tables VI and VII, all values being 
stated in terms of record number 12 as a constant point of reference. 
Record number 1, for example, is .36 PE above number 12 if the judg- 
ments of the experts are used and .21 PE above if the non-expert 
judgments are employed. The value finally used is midway between 
these two, at .28 PE above number 12. Table VIII also shows the 
number of times each record is played in administering each of the 
four scales. 




TABLE VI. RATINGS OF PHONOGRAPH RECORDS BY NON-EXPERT JUDGES
Frequency with which records listed on horizontal scale were judged better than
the records listed on the perpendicular scale at the left

Series A, 368 judges                         Series B, 371 judges
No.    No. 1   No. 2   No. 3   No. 4         No.    No. 5   No. 6   No. 7   No. 8
  1      -       39     145     255            5      -      275     146     186
  2     329      -      315     349            6      96     -        90     132
  3     223      53     -       277            7     225     281     -       232
  4     113      19      91     -              8     185     239     139     -

Series C, 353 judges                         Series D, 346 judges
No.    No. 9   No. 1   No. 10  No. 11        No.    No. 12  No. 13  No. 4   No. 14
  9      -      251     197     229           12      -       95     244     107
  1     102      -      121     192           13     251     -       301     140
 10     156     232     -       212            4     102      45     -        53
 11     124     161     141     -             14     239     206     293     -

Series E, 306 judges                         Series F, 320 judges
No.    No. 15  No. 16  No. 17  No. 3         No.    No. 2   No. 18  No. 19  No. 20
 15      -      145     100     149            2      -      301     279     220
 16     161      -      109     173           18      19     -        82      48
 17     206     197     -       193           19      41     238     -        86
  3     157     133     113     -             20     100     272     234     -

Series G, 318 judges                         Series H, 206 judges
No.    No. 5   No. 4   No. 3   No. 2         No.    No. 21  No. 2   No. 22  No. 13
  5      -      270     227      93           21      -      134       3     180
  4      48      -       96      33            2      72     -         1     163
  3      91     222     -        45           22     203     205     -       205
  2     225     285     273     -             13      26      43       1     -

Note: Under Series A this table reads as follows: “Record number 1 was judged better than
record number 2 by 329 judges, better than number 3 by 223 judges, and better than number 4 by
113 judges, etc.”































TABLE VII. RATINGS OF PHONOGRAPH RECORDS BY EXPERT JUDGES

Series A, 18 judges                          Series B, 18 judges
No.    No. 1   No. 2   No. 3   No. 4         No.    No. 5   No. 6   No. 7   No. 8
  1      -        1       3      13            5      -       16      11      13
  2      17      -       14      18            6       2     -         5       3
  3      15       4     -        17            7       7      13     -        11
  4       5       0       1     -              8       5      15     [?]     -

Series C, 18 judges                          Series D, 18 judges
No.    No. 9   No. 1   No. 10  No. 11        No.    No. 12  No. 13  No. 4   No. 14
  9      -       16      12      18           12      -        3      14       3
  1       2      -        5      13           13      15     -        18       7
 10       6      13     -        18            4       4       0     -         2
 11       0       5       0     -             14      15      11      16     -

Series E, 18 judges                          Series F, 32 judges
No.    No. 15  No. 16  No. 17  No. 3         No.    No. 2   No. 18  No. 19  No. 20
 15      -        8       6       6            2      -       31      31      21
 16      10      -        7      10           18       1     -        17       3
 17      12      11     -        11           19       1      15     -         1
  3      12       8       7     -             20      11      29      31     -

Series G, 32 judges                          Series H, 23 judges
No.    No. 5   No. 4   No. 3   No. 2         No.    No. 21  No. 2   No. 22  No. 13
  5      -       30      17       6           21      -       21       0      16
  4       2      -        1       0            2       2     -         0      15
  3      15      31     -         7           22      23      23     -        23
  2      26      32      25     -             13       7       8       0     -

Series I, 14 judges
No.    No. 3   No. 15  No. 18  No. 4
  3      -       12     [?]      14
 15       2      -      [?]      11
 18       5     [?]     -        12
  4       0       3     [?]     -




























TABLE VIII. CALCULATED VALUES AND FREQUENCY OF USE OF EACH RECORD

           PE value from number 12          Number of times played
Record     Experts   Non-experts   Mean     (in Alpha, Beta, Gamma, Delta)
number
   1        +.36       +.21        +.28     1, 1
   2       -2.42      -1.55       -1.98     1, 1, 1, 1
   3       -1.05       -.06        -.55     2, 1, 1
   4       +1.65       +.76       +1.20     1, 1, 1, 1
   5        -.04       -.87        -.90     [?], 1, 1, 1
   6        +.86       -.10        +.38
   7        -.30      -1.21        -.80
   8        -.21       -.75        -.48     1, 1, 1, 1
   9       -1.85       -.50       -1.17     1, 2, 1, 2
  10       -1.30       -.20        -.79     2, 1, 1, 1
  11       +2.42       +.17       +1.29
  12        0          0           0        1, 1, 2, 1
  13       -2.04       -.73       -1.38     2, [?], 1, 1
  14       -1.55       -.84       -1.19
  15        -.68       -.02        -.35     1, 2, 2
  16       -1.04       -.18        -.61     1, 1, 1
  17       -1.41       -.66       -1.03     1, 1, 1, 1
  18        +.25       +.85        +.55     2, 1, 1, 1
  19        +.51       +.05        +.28     1, 1, [?], 1
  20       -1.92       -.79       -1.35     1, [?], 1, 1
  21       -3.27      -2.11       -2.60
  22       -7.17      -5.06       -6.11


























In general it will be observed that the expert judges rate the better 
records higher and the poorer records lower than they are rated by the 
non-expert judges. There are peculiar qualities about records num- 
bers 6, 7, 8, and 11, however, which cause the non-expert judges to 
rate them much lower than they are rated by the experts. Perhaps it 
is the failure of the non-expert judges to see beyond the obviously 
unusual combinations of instruments and phrases. On the other hand,
the non-experts very greatly over-estimate the relative quality of
record number 18. 

In the arrangement of the final tests, record number 14 was not 
used, for both the matrix and the musical score from which it might 
be reproduced have been destroyed. Records numbers 6, 7, and 11 
were not used because of the extreme differences between their ratings 
by the two types of judges, and numbers 21 and 22 were omitted 
because their values are so low as to cause a serious question in many 
minds as to whether they have any musical merit whatever. 


















[Chart omitted: horizontal scale from -2.0 to +2.0; legible labels include “Non-expert Ratings” and “Mean.”]
Relative Merits of Phonograph Records
Fig. 3.


Although record number 8 is under-estimated by ordinary judges, 
it was used in the final scales. Care was taken, however, to see that 
the lines of its associates in any group in which it appears do not in 
Figure 3 cross its own line. In Group B of Scale Delta, for example, 
it is associated with number 4 and number 19, both of which have lines 
running almost parallel to its line. The general rule was followed in 
the selection of all groups that no two records should be played in the 
same group if their lines in Figure 3 crossed each other, and that as 
far as possible all lines in a single group should be approximately 
parallel. 
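
The rule against crossing lines may be checked numerically in a short sketch. It rests on the assumption that each record's line in Figure 3 runs straight from its expert value to its non-expert value, so that two lines cross only when the two sets of judges rank the records in opposite order; the name `lines_cross` is illustrative, and the values are taken from Table VIII.

```python
# (expert PE, non-expert PE) relative to record number 12, from Table VIII.
VALUES = {
    4: (1.65, 0.76),
    8: (-0.21, -0.75),
    18: (0.25, 0.85),
    19: (0.51, 0.05),
}


def lines_cross(a: int, b: int) -> bool:
    """True if experts and non-experts rank records a and b in opposite order."""
    (ea, na), (eb, nb) = VALUES[a], VALUES[b]
    return (ea - eb) * (na - nb) < 0


# The three records named in the text for Group B of Scale Delta.
group = [8, 4, 19]
pairs = [(a, b) for i, a in enumerate(group) for b in group[i + 1:]]
group_is_admissible = all(not lines_cross(a, b) for a, b in pairs)
print(group_is_admissible)
print(lines_cross(18, 19))
```

Under this assumption the three records named for Group B show no crossing, whereas numbers 18 and 19, which the two sets of judges rank in opposite order, could not be played in the same group.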

Group A in each scale presents selections which have the largest 
amounts of difference between their general values. Group B presents 
selections having somewhat smaller differences in their values, which 
will naturally result in a larger number of errors when people attempt
to rate their relative merits. Groups D and E present records which
have such small differences in their general values that only well 
trained listeners may be expected to distinguish correctly among them. 
The increasing difficulty from first to last makes the test a practical 
measure of how small a difference can be detected. No selection is 
played more than twice in any scale, and so far as possible each is 
played but once. In only one case where a record is played more than 
once does the second playing come in the next group after the first 
playing. 

Reliability.—Unfortunately the tests do not seem to be as reliable 
as one would desire, although they are probably more reliable than any 
estimates of ability the average teacher of music would be able to make. 
In one of the writer’s classes at the University of North Carolina, 44 
sophomores were tested first with Scale Gamma and then with Scale 
Delta. The coefficient of correlation (Pearson formula) between the 
two sets of results was .508, which is low when one considers the time 
required to administer the tests, although fairly satisfactory if one 
considers the fact that there are only 15 or 16 elements to be rated in 
each scale, and that the tests measure emotional rather more than 
intellectual reactions. The only other check yet obtained on the 
reliability of the tests is a coefficient of correlation of .519 between 
Alpha and Beta in a class of 21 high school students. 
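
For readers who wish to repeat such a check with their own classes, the Pearson product-moment formula may be sketched as follows. The scores shown are hypothetical and are not data from the study; the function name `pearson_r` is likewise an illustrative assumption.

```python
from math import sqrt


def pearson_r(x: list[float], y: list[float]) -> float:
    """Product-moment correlation between two equally long lists of scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of equal length with at least two scores")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    dx = [a - mean_x for a in x]
    dy = [b - mean_y for b in y]
    numerator = sum(a * b for a, b in zip(dx, dy))
    # The denominator is zero if either list is constant; such data have no r.
    denominator = sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    return numerator / denominator


# Hypothetical scores for six pupils on two scales; NOT data from the study.
gamma_scores = [9, 11, 7, 12, 10, 8]
delta_scores = [10, 9, 8, 12, 11, 7]
print(round(pearson_r(gamma_scores, delta_scores), 3))
```

With classes as small as the 21 and 44 pupils reported here, a coefficient so computed is subject to considerable sampling fluctuation.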

Apparently there is little relationship between the ability measured 
by these scales and general academic ability. In the group of sopho- 
mores at the University of North Carolina the lowest score in judging 
music was made by the boy who had next to the highest score in the 
Miller Ability Test. The coefficient of correlation between the musical 
judgment scores and the mental ability scores for the 39 students who 
took both tests was only .18, although the quadrant of the correlation 
table indicating low academic ability and high musical judgment 
contained but one individual. Perhaps one must have at least a 
sufficient amount of academic ability to make 70 points on the Miller 
Test before he can learn to judge orchestral music successfully. It 
seems certain, however, that high general academic ability does not at 
all imply high ability to judge orchestral music. Musical appreciation 
is apparently a highly specialized trait, although it seems to 
be remarkably susceptible to training. 










NOTES ON ARTICLES IN EDUCATIONAL
PSYCHOLOGY IN CURRENT ISSUES OF
OTHER MAGAZINES











REPORTED BY CECILE COLLOTON 


Department of Educational Psychology, The Lincoln School of Teachers College 
INTELLIGENCE TESTS 


What is the IQ? A. H. Martin. The Australasian Journal of Psychology and 
Philosophy, 1923, September, 174-176. An explanation in simple language of the 
meaning of the term intelligence quotient. 

Educational Implications of the IQ. John Adams. The Australasian Journal 
of Psychology and Philosophy, 1923, September, 177-190. Emphasizes the need 
for an objective standard in education and discusses the arguments for and against 
the use of the IQ as such a standard. 

A Program Arrangement for Mental Groups. Lee C. Rasey. The School
Review, 1923, October, 608-611. How an x, y, z classification of high school
students works in practice. The school has 1050 pupils and the plan has been used
for two years. Grouping is made on the basis of intelligence tests supplemented by
records of actual accomplishment and teachers’ estimates.

The Sectioning of High School Classes on the Basis of Intelligence. Gustave A. 
Feingold. Educational Administration and Supervision, 1923, October, 399-415. 
An experiment in sectioning high school freshmen on the basis of ability proves 
advantageous to both pupils and teachers. Seven tables give interesting statistical 
data. 

Berlin Schools for Gifted Children. Adolph E. Meyer. The Pedagogical 
Seminary, 1923, September, 205-210. Describes the intelligence test used for the 
selection of gifted children and gives some information as to progress of children 
in the special school. 

Vocational Tests for Agricultural Engineers. H. E. Burtt and F. W. Ives.
Journal of Applied Psychology, 1923, June, 178-187. Describes a mental test 
designed to predict special ability in agricultural engineering. 

Some Experiments with Mental Alertness Tests at Northwestern University. 
Paul L. Palmer. School and Society, 1923, Nov. 3, 536-540. The results of a 15 
minute test required of all Liberal Arts freshmen at Northwestern University are 
discussed in detail. The test score is particularly valuable in predicting the success 
of the first and fifth quintiles. 

Psychological Tests versus the First Semester’s Grades as a Means of Academic 
Prediction. John L. Ernst. School and Society, 1923, Oct. 6, 419-420. Correla- 
tions between scores on the Army Alpha and grades of the entire college course, 
and between grades for the first semester and the entire course show the Alpha 
Test to be a better means of predicting a student’s ability than his first semester 
grades. 

Diagnosis of the Unstable Moron. George Ordahl. The Journal of Delinquency,
1923, March, 99-112. Proposes classifying responses on intelligence tests
according to the relevancy of reactions to determine stability. Five individual
cases furnish illustration. 

The Relation between the Intelligence and Vocational Choices of High School 
Pupils. Gustave A. Feingold. Journal of Applied Psychology, 1923, June, 143- 
153. A comparison of the vocational choices of 512 high school freshmen tested 
with a modified Army Alpha, and Fryer’s Occupational-Intelligence Scale shows
that only 46 per cent make proper vocational choices; 47 per cent choose vocations 
beyond their mental reach; and 7 per cent underrate their ability. 

A Study of the Correlation of College Students’ Estimates of Intelligence with the 
Otis Tests and Other Scales. J. U. Yarbough. Journal of Applied Psychology, 1923,
June, 157-167. Thirty college students rate each other in intelligence on a scale 
devised by themselves. Their ratings are compared with the instructor’s estimates, 
college grades, and scores on Otis group test. 

Intelligence Scores of Colored Pupils in High Schools. E. L. Thorndike. School
and Society, 1923, Nov. 10, 569-570. A comparison of the scores of negro and 
white children on two intelligence tests shows the decided superiority of the white 
pupils. Only 4 per cent of the negroes reach the median of the white children. 

Intelligence and Literature. Ross W. Rohn and Thomas H. Briggs. School and 
Society, 1923, Oct. 27, 508-510. Compares the reading, voluntary and required, 
of 277 high school pupils and their intelligence as measured by the Terman Group 
Test of Mental Ability. Forty seven superior and 54 inferior pupils are studied 
in detail. 

EDUCATIONAL TESTS


Tests in History and the Social Studies. The Historical Outlook, 1923,
November. Practically the whole issue of the magazine is given over to a
discussion of tests in the following articles:

Improving the teaching of history through the use of tests. Bertha Elston.
List of History Tests.

Written Examinations and their Improvement. Prof. W. S. Monroe.
Examples of History Tests.

New Tests for Old. Richard H. Shryock.

New Types of History Tests. F. E. Moyer.

Evaluating the Aims and Outcomes of History. Earle Rugg.

New Kinds of Tests in Social Science. Ruth E. Hardy.

Some Investigations Concerning the Use of Certain Home Economics Information 
Tests. Anna M. Cooley and Grace Reeves. Describes three Home Economics 
Tests dealing with (1) studies pertaining to clothing, (2) studies pertaining to food, 
(3) other household activities. Findings based on 1065 tests in 40 different schools 
in 30 towns or cities of the United States are discussed. 

A Dictionary Test. Thomas H. Briggs. Teachers College Record, 1923, 
September, 355-365. A diagnostic test to determine how much instruction 
children need in the use of the dictionary. Not standardized. Based on Webster’s
Standard School Dictionary. 


CASE STUDIES
The Psychological Clinic, 1923, March-April


1. Patsy. Margaret C. Brooke, 41-43. The boy who found a needed friend in 
the clinician. 




2. Allison and His Parents. Bernice Leland, 44-47. 

The parents’ responsibility for a child’s abnormal reactions and attitude. 

3. Maurice. Catherine Riggs, 52-55. Diagnostic teaching. 

4. Gladys. Beatrice M. McCully, 56-58. Diagnostic teaching. 

5. Albert. Helen W. Brown, 59-60. Diagnostic teaching. 

The Superior Child. Alice M. Jones. The Psychological Clinic, 1923, March- 
April, 1-8. The first of a series of case studies. Describes four children with 
exceptionally high IQ’s showing the great range of individual differences. 

The School Psychologist. R. B. W. Hutt. The Psychological Clinic, 1923, 
March-April, 48-51. Individual cases show what the psychologist can do to help 
“failures” in school.


MISCELLANEOUS 


A Plan of Organization for Taking Care of Bright Pupils. W.C. French. The 
Elementary School Journal, 1923, October, 103-108. Describes special courses 
given in a small school system by the regular teachers to “enrich the curriculum”
for the superior child. 

What Can the Secondary School Do for the Student of Low IQ? Margaret M. 
Alltucker. The School Review, 1923, November, 653-661. Suggestions for 
modifying curriculum and instruction to meet the needs of the child of limited 
mental capacity and make him an efficient member of society. 

The Relation between Physical and Mental Development. Mary L. Dougherty. 
The Elementary School Journal, 1923, October, 130-134. A study of twins—girl 
and boy—with reference to mental and physical development and personal 
characteristics. 

The Subnormal Child. Walter E. Furnald. School and Society, 1923, Oct. 6, 
397-406. Paper read at Harvard Teachers Association. Proposes a program for 
the salvaging of the subnormal child in the public schools. 

Personnel Work at Its Source. Ruth Swan Clark. School and Society, 1923,
Oct. 27, 487-491. Educational and vocational guidance in New York City
schools. The use of tests in determining vocational work.

How an Instructional Research Department Can Assist Teachers. P.T. Rankin. 
Journal of Educational Research, 1923, October, 187-198. To make for greater 
efficiency in the learning process, testing in the school subjects should be carried on 
by the teacher with the help of the research department in the selection and the 
interpretation of the tests. 

Improvement in Rating the Intelligence of Pupils. G. F. Varner. Journal of 
Educational Research, 1923, October, 220-232. Proposes a scale for rating intelli- 
gence which takes into account five factors usually tending to make teachers’ ratings 
unreliable. Describes the use of the scale in two school systems. 

A Rating Scale for Individual Capacities, Attitudes and Interests. W. Hardin 
Hughes. The Journal of Educational Method, 1923, October, 56. Describes 
a scale used in the Pasadena junior and senior high schools. Fifty students are 
rated at one time on 12 character traits and 10 special interests. Each group of 50 
is divided by a “man-to-man” comparison into five groups, from 2 to 5 being
placed in the lowest and highest groups, 10-15 in the inferior and superior groups 
and the remainder in the average group. 

Comparative Social Traits of Various Races. Second Study. C. B. Davenport 
and Laura C. Crayton. Journal of Applied Psychology, 1923, June, 127-34. 




One hundred and eighty-eight sets of judgments on 102 high school students give 
tentative conclusions as to the differences in 10 social traits among Germans, Irish, 
Italians, Austrians and Russians. Three tables present the statistical data.

A Study of a Small Group of Irish-American Children. Rebecca E. Leaming. 
The Psychological Clinic, 1923, March-April, 18-40. Data from 110 cases show 
definite racial differences and characteristics in test results. Seventeen cases are 
described in detail. 

A Study of 1000 Children Who Do Not Conform to School Routine. Selina 
McCaulley. The Psychological Clinic, 1923, March-April, 9-17. Emphasizes 
the need for a specialized curriculum for the backward child. 

Delinquents and Non-delinquents on the Downey Will-Temperament Test. Edythe 
K. Bryant. The Journal of Delinquency, 1923, January, 46-64. Reports the use 
of the Will-Profile Test with 420 normal boys. Comparisons are made with 100 
tests of delinquent boys described in a previous article. 

The Diagnostic Value of Individual Record Cards. Clay Campbell Ross. 
Educational Administration and Supervision, 1923, October, 439-444. Aims at 
the prediction of success in high school through a study of elementary school 
records by the method of partial correlations. Spelling record the best single 
measure of fitness for high school according to data from 42 cases. 

An Analysis of Multiplication Drill. F. B. Knight. Journal of Educational 
Research, 1923, October, 199-207. A detailed analysis of two sets of practice 
exercises in multiplication shows important psychological differences with reference 
to the laws of learning. 

Syllabification as a Factor in Learning to Spell. Harry A. Greene. Journal of 
Educational Research, 1923, October, 208-219. Summarizes three experimental 
investigations of the problem of syllabication and reports a fourth in detail.

The Effect of Locality on Language Errors. Dagny Sunne. Journal of Educa- 
tional Research, 1923, October, 239-251. A Study of the written work of 8618 
children in Louisiana Schools. Results compared with the Charters report show 
that in general syntactical errors hold relatively the same order in different locali- 
ties, but that many language errors are peculiar to the community. 

A Method for Measuring the “Vocabulary Burden” of Textbooks. Bertha A. 
Lively and S. L. Pressey. Educational Administration and Supervision, 1923,
October, 389-398. The vocabulary difficulty of 15 books and one newspaper is 
evaluated by a study of thousand-word samplings with reference to the range of 
vocabulary, number of zero value words, and weighted median index number. 

The Test-study Method versus the Study-test Method in Spelling. John H. 
Kingsley. The Elementary School Journal, 1923, October, 126-129. An
analysis of the spelling records of Grades V to VIII over a period of two years 
shows the great superiority of the test-study method. 

The Promotion Plan in the Horace Mann Elementary School and Kindergarten.
Clara Chassell. Educational Administration and Supervision, 1923, October,
445-447. Describes the various tests and measures used in the school and the 
method of determining composite scores. 

A Study of Emotional Stability in Children. Ellen Matthews. The Journal of 
Delinquency, 1923, January, 1-40. Reports the result of the use of a modification 
of the Woodworth personal data sheet with an unselected group of 1133 children 
and 436 selected children. The effect of sex, race, age, intelligence, etc. is studied. 
Complete statistical data are given. 

























The Spelling of Homonyms: An Experimental Investigation of Teaching Them. 
E. O. Finkenbinder. The Pedagogical Seminary, 1923, September, 241-251.
Concludes from the experimental data that the “separate method” is more effective.

A Two-year Experiment with Vocational Guidance in a Woman’s College. Iva 
Lowther Peters. The Pedagogical Seminary, 1923, September, 225-240. Illus- 
trates what can be done by a vocational bureau working in close cooperation with 
the administrative and academic departments. 

Latin as a Preparation for French. Thomas J. Kirby. School and Society,
1923, Nov. 10, 563-569. Data secured from 268 freshmen at the State University 
of Iowa show a positive though low correlation between years of Latin studied in 
high school and marks made in first and second semester French in the University. 


NEW TESTS


Ruch-Popenoe General Science Test. Giles M. Ruch and Herbert F. Popenoe. 
A test of achievement in general science for grades 7 to 9. Can be given in a 45 
minute period. Two alternative forms. Price per package of 25 examination 
booklets including Manual of Directions, 1 Key, 1 Percentile Graph, and 1 Class 
Record, $1.50 net. Specimen set .25. Published by World Book Co., Yonkers- 
on-Hudson, New York. 

Otis Classification Test. Arthur S. Otis. A combined mental ability and 
educational achievement test for use in the regrading and classifying of children in 
Grades IV-VIII. Requires 60 minutes actual working time. Scoring is very 
simple. Two alternative forms. Price per package of 25 booklets, 1 Key, 1 
Interpretation Chart and Percentile Graph and 1 Class Record, $1.30. Manual 
of Directions .25. Specimen set .35. Published by World Book Co., Yonkers- 
on-Hudson, N. Y.

Morrison-McCall Spelling Scale. J. Cayce Morrison and William A. McCall. 
Eight lists of 50 words each for testing spelling ability of pupils in grades 2 to 8. 
Directions for giving the test, scoring paper, and interpreting results are contained 
in the same pamphlet with the word lists, illustrative sentences, and tables of age 
norms and T scores. Published by World Book Co., Yonkers-on-Hudson, N. Y.

Lewis English Composition Scales. Erwin Eugene Lewis. The first tests 
designed for measuring business and social correspondence. Separate scales for 
measuring order letters, letters of application, narrative social letters, expository 
social letters, and simple narration. Complete directions are printed in the book- 
let. Price .25. Published by World Book Co., Yonkers-on-Hudson, N. Y. 

Van Wagenen English Composition Scales. M. J. Van Wagenen. Separate 
scales for exposition, narration, and description. Thought content, structure, and 
mechanics are rated separately and averaged for a general merit score. Complete 
directions are provided in the booklet. Price .25 net. Published by the World 
Book Co., Yonkers-on-Hudson, N. Y. 

The Lohr-Latshaw Latin Form Test for High Schools. Harry Franklin Latshaw. 
Published as Number 1 of the “Studies of Education” by the Bureau of Educational 
Research, University of North Carolina, Chapel Hill, N. C. 

Blackstone Stenographic Proficiency Tests. E. G. Blackstone. Typewriting 
tests in five alternative forms. $1.00 per package of 25 including a manual of 
Directions, a Percentile Graph and Record Sheet. Specimen sets, .25. Tests in 
note taking and transcribing are now being prepared. Published by World Book 
Co., Yonkers-on-Hudson, New York or 2125 Prairie Avenue, Chicago. 













NEW PUBLICATIONS IN EDUCATIONAL
PSYCHOLOGY AND RELATED FIELDS OF
EDUCATION


DEPARTMENT IN CHARGE OF LAURA ZIRBES¹


1. Two Introductions to Psychology.—In few college courses does 
the content of the course differ so widely from institution to institu- 
tion as in the subject of psychology. What is taught under that name 
in one school is anathema in another. And this is probably more 
true today than it was a generation ago, when the psychologists of this 
country were more or less loyal to the leadership of James. Although 
the present state of affairs may be deplored by some psychologists, it is 
regarded by others as indicative of the healthy and vigorous growth of 
this new and young science. The lusty infant will not stay put. It 
refuses to be limited and confined to any one particular path. 

This great diversity of opinion as to what should be the content of a 
course in Psychology is well emphasized in two introductions to psychology
recently published. One is by Seashore,² a name well known
to all students of psychology, and the other is by Griffith,³ one of the
younger workers in the field. Each of the authors in the preface says 
that he is trying to make the subject vital for the student. Seashore 
wants the student to “psychologize;” he wants psychology to function 
in the life of the student. Griffith would show the student how vitally 
psychology is concerned with the business of living in all its various 
ramifications. And so each writer sets forth upon his way, and the two 
ways seldom, if ever, meet. 

Seashore devotes about a hundred pages to sensation; Griffith about 
eight. Seashore treats in orthodox fashion of perception, attention, 
association, memory, thought; Griffith has no special treatment of these 
topics as such. Abnormal psychology, social psychology, industrial 
psychology are all treated at length by Griffith. Only as an after- 
thought does Seashore make room at the end of his book for a chapter 














1 All unsigned reviews were prepared by Laura Zirbes.
2 Seashore, C. E.: “Introduction to Psychology.” New York, Macmillan Co.,
1923, pp. XVIII + 427.
3 Griffith, C. R.: “General Introduction to Psychology.” New York, Macmillan
Co., 1923, pp. XV + 513.


on dreams and a brief chapter on individual psychology. They do 
not seem to belong in the logical system he has built up. 

It would seem to me that both books have their merits and their 
drawbacks as introductions to psychology. A student who follows 
Seashore may conceivably get a respect for psychology as a well- 
ordered science, somewhat dry and formal, and very much removed 
from daily life. If he continues his study, this rigid training will 
probably be of value to him. On the great majority of students, who
take only one course in psychology, the impression will be left
that psychology has a lot to say about the abstruse facts of mental
life but relatively little about the business of living. It would seem 
difficult to justify the teaching of innumerable facts about sensations 
to the ordinary student. Of what pedagogical value is accommoda- 
tion, convergence, retinal image, external projection, and the like? 
And so fact after fact is marshalled before the reader, with only the 
relief of the “exercises” scattered throughout the book.

Now the average student comes to his course in psychology with a 
very vague notion of what psychology is all about. He has his own 
notions, generally very fantastic. But he does think that psychology 
will tell him something about hypnotism, or character-reading, or 
phrenology, or will-power; that it will teach him how to think correctly 
or develop more personality; that it will explain insanity and tell him 
whether animals reason or not. All these and many other things are 
treated in a delightful fashion by Griffith. I can imagine the student 
skipping “Part One” of the text, which deals with a discussion of
structuralism, functionalism, behaviorism and similar “isms,” and 
becoming completely absorbed in the rest of the book, which deals with 
genetic, social, abnormal and applied psychology. There are few 
books that would answer so well the many questions which the first-year 
student in psychology raises, as this text by Griffith. Naturally the 
treatment of any one field is at times sketchy and the specialist in any 
one of the numerous fields will be dissatisfied with the treatment in his 
particular field, but, considering the purpose of the author, on the whole, 
the book is excellently planned and executed. 

To the student of education, neither book makes any decided 
appeal. R.-P. 





2. Educational Progress.—In his book entitled “Progressive Education,”¹
Professor Mirick first presents an outline of the difference


1 Mirick, George: “Progressive Education.” New York, Houghton Mifflin
Co., 1923, pp. X + 314. 






between the philosophical basis of considering educational topics and 
the modern factual basis. No one can justly claim that the scientific 
method of studying educational problems is as yet firmly established. 
Its merits are so rapidly becoming recognized that even though a good 
many persons are not fully convinced regarding scientific study, they 
do appreciate that it is not quite respectable to fight against the scienti- 
fic study of education. The author argues that although scientific 
study tends constantly to produce change, and philosophic study 
tends to keep things as they are, the two points of view should be used 
to balance one another in considering educational topics. Scientific 
thinking “tends to instability and restlessness.” “Without philosophic
thinking there would have been no customs, no institutions, no
organized society, nothing stable on which mankind could stand while
preparation was being made for the next advance.” In advocacy of a
scientific point of view as well as one that is philosophic, it seems that 
the author exerts himself over-much to produce a philosophic analysis 
of modern scientific study as relates to education. Though his 
discussions are often not easy to follow, his conclusions given in the 
summaries are clear and thought-provoking. For example, “All
education is self-education.”

“All education has to do primarily with impulses and desires
rather than with presentation of material.”

“Progressive education is just real education. Science confirms 
and improves the educational policies that ‘natural’ teachers have 
always followed.” 

“Much of human contact can be understood only as it is seen from 
the biological point of view.” 

Throughout, the book embodies current educational thought and 
literature, citing authorities regularly and abundantly. Indeed, the 
book will be found most useful as a summary of the modern educational 
philosophy which has been presented in books and magazines. Almost 
no effort is made to include a presentation of data from experimental 
work in the improvement of practice. If specific experimentation is 
a determining factor in educational progress, this is a significant lack. 
There are now something like fifty schools in America which may properly
be called experimental in their efforts to help education to progress.
Books including such titles as “Progressive Education” and the
“School Curriculum” would be helped if they would present the programs
and procedures of a score or more of experimental schools which



are trying, with acknowledged shortcomings, to apply the scientific 
method in developing improvements in education. 


Otis W. CALDWELL. 





3. Outlines of Courses in Psychology.—The printed outline of a course
to be used by the student is a teaching device evidently growing in
popularity. Professor Gifford presents two such outlines, one for
general psychology,¹ and the other for educational psychology.²
Both of these outlines are very well done, and will doubtless be of 
great help to the author’s students. They may interest other teachers 
of psychology in connection with the planning and arrangement of 
their courses. The reviewer does not believe that any teacher of 
psychology should use someone else’s outline in actual classroom work. 


4. A Child Study Manual for Parents and Teachers.³—This is a well
organized outline of 51 topics, each of which deals with some phase of 
child life, from infancy through adolescence. Each section or chapter 
is introduced by a two or three page statement which gives a pre-view 
or point of departure for reading, study and discussion. Then follows 
a brief summary in outline form and a rather comprehensive list of 
specific references. The “popular” sources are listed first so that the 
group discussion may start with the more general presentations and 
continue the study through the “non-technical” readings to the
“technical” treatment of specific phases of the subject. 

This outline will be of great value to a very small proportion of all 
parents who are interested in child study. It presupposes leisure time 
for study, training sufficient to read and understand the wording of the 
text and sources quoted, and ability to organize material thus secured 
and interpret and apply it to particular problems. But the average 
parent, comprising a much larger proportion, is not capable of using 
such an outline. It is not intended for such, but may a plea be pre- 


1 Gifford, W. H.: “Introduction to Psychology, A Syllabus.” Harrisonburg,
Va., Garrison Press, 1923, pp. 35.

2 Gifford, W. H.: “Introduction to the Learning Process, A Syllabus in Educational
Psychology.” Harrisonburg, Va., Garrison Press, 1923, pp. 34.

3 Gruenberg, Benjamin C.: “Outlines of Child Study.” Edited by B. C.
Gruenberg for the Federation for Child Study. New York, Macmillan Co., 1922,
pp. XX + 260.







sented here for another outline of child study? The first requirement 
should be that the material be organized by those who, like the editor 
of this outline, really know the subject-matter. Second, the material 
should be presented in language that the average parent can grasp. 
Third, it should require a minimal amount of study and search for 
and through books which are often inaccessible and uninspiring to the 
man or woman tired out by the day’s work. Fourth, it should be 
adapted to the problems of the average home and vitalized by the 
inclusion of numerous simple concrete episodes such as are written by 
Angelo Patri in the New York Evening Post. That parents are 
becoming more and more interested in child study is evidenced by the 
numerous articles in newspapers and magazines and the appearance 
of whole sets of books which claim to solve all the problems of child 
training. There is a great need for studies and study outlines written 
by those who know, and in language which does not make the average 
parent feel that the discussion is too theoretical, technical and remote 
from his actual problems. 

Persons charged with the responsibility of organizing study groups 
and leading discussions of problems of child study will meanwhile be 
glad to avail themselves of this outline and will find the references 
well selected and carefully organized. 

BERTHA MILLER Ruvae. 





5. A Scientifically Determined Course in Handwriting.¹—Here we
have an organized sequence of aims and exercises based on clearly 
defined psychological and pedagogical principles which govern learning 
and writing. In the four concise chapters the experimental and 
investigational basis for a course of writing lessons for the six grades of 
the elementary school is discussed. In view of their derivation the 
lucid statements of aims and standards dispose one favorably to the 
detailed prescription of daily procedure, which might otherwise seem 
arbitrary and over-didactic. The content of the exercises usually
has double sanction. (1) It is selected to fill particular handwriting 
needs. (2) It shows a realization of the instrumental function of hand- 
writing in relation to other school subjects and activities. 

This handbook is another milestone along the path of progress which
leads past empirical commercial systems and procedures toward scien- 
tifically determined educational standards, methods and materials. 





1 Freeman, Frank N. and Dougherty, Mary L.: “How to Teach Handwriting.”
New York, Houghton Mifflin Co., 1923, pp. V + 305. 















6. Socialized English.—A dozen verbatim reports of socialized
recitations in the varied aspects of eighth-grade English¹ will prove
not only enlivening to any teacher of English but also very practically 
suggestive. If one is already committed to this type of recitation, 
he will, no doubt, react quickly to the fervor of an editorial introduction 
by James E. McDade, who says, ‘“‘The recitation soars far above the 
old level of monotonous routine, and becomes a game, a romance, an 
adventure even.” The reports themselves, unlike many others of 
similar import, do not conceal the fact that a very capable teacher in 
the offing somehow sees to it that the romantic flight above the old 
level starts and finishes on the ground. 

7. Practical Assistance for Teachers in Service.—The practice of stating
the exact purpose of a book in the author’s preface considerably simpli- 
fies the task of a reviewer. It furnishes him with a sort of yard stick 
by which to judge the work. In the preface of the book under 
consideration,² the author says, “The justification for this, another
treatise on the elementary course of study, is the fact that compara- 
tively a very small number of the teachers in the elementary schools 
are so situated as to have immediately the results which are continu- 
ally being discovered in the new aspects of educational theory and 
practice. The author offers no excuse for playing the role of codifier 
for these newer findings in education. For he regards this as a distinct 
sort of contribution necessary for educational progress.”

The book, then, is a résumé, not an exhaustive treatise of any one
aspect of the problem of curriculum building. Some general princi- 
ples of elementary education are discussed in the first 35 pages. 
Each subject of the curriculum is then considered separately; the 
subject-matter for each grade, and some discussion of method and the 
measurement of results are included. There is nothing epoch-making
about this book; if the potential readers could come into direct contact 
with the subject-matter which it seeks to codify, rather than with the 
codification, more satisfactory results in the change of teaching method 
might be expected. But the average teacher in service will find here 
considerable inspiration and practical help for the daily routine of 
classroom teaching. EDWIN H. REEDER.


Teachers College, N.Y.C. 


1 Rusch, Louise C.: “The Socialized Recitation in English.” Modern Education
Series. Chicago, The Plymouth Press, 1923, pp. 88.

2 Phillips, Claude A.: “Modern Methods and The Elementary Curriculum.”
New York, The Century Co., 1923, pp. XIII + 389.



























No, 27147. 


Puzzle-box, Freeman’s. For studying learning of the problem-solving type, and
comparing the naive with the instructed method. This puzzle-box is of substantial
construction and will stand hard service. . . . $56.25 Net







No. 31215. 


Mirror Drawing and Hand Tracing Apparatus, Freeman’s. The purpose of this piece of 
apparatus is to provide means for a practical illustration of the development of a
motor-coordination. The apparatus introduces a modification of the usual relationship between
the hand movements and the resulting pen movements, and, in addition, an apparent 
change in the direction of the pen movements by means of a mirror. . . .$36.00 Net 


C. H. STOELTING CO.


Manufacturers—Importers—Exporters 
Psychological and Physiological 


Apparatus and Supplies 
3037-3047 Carroll Ave. CHICAGO, ILL., U.S.A. 

























Of Interest to Teachers and Parents

New or recent books that explain modern educational thought and practice,
or important phases thereof, and aid in solving the problems involved
in the relations of home and school and community.

MOORE
PARENT, TEACHER, and SCHOOL—Price $2.00

GRUENBERG
OUTLINES OF CHILD STUDY—Price $1.80

KEITH AND BAGLEY
THE NATION AND THE SCHOOLS—Price $1.60

Write for information

THE MACMILLAN COMPANY
New York  Boston  Chicago  Atlanta  Dallas  San Francisco

























Just Published 





A Laboratory Manual in the Psychology of Learning. 
By William Henry Pyle. 
Price $1.50; postage 8c. 


Since the public school has as its purpose the direction and guidance of 
the learning of children, teachers should know as much as possible about 
the nature of the learning process and the laws of learning. If it is worth 
while to have laboratory courses in physics, chemistry, and the biological 
sciences, it must also be worth while to have laboratory courses in the 
psychology of learning. Sciences dealing with human nature have been 
too vague and general both as to the facts and their application. They 
must get closer to the facts; they must rely upon the laboratory. No 
amount of book study about human nature can take the place of even a 
few carefully devised and executed experiments. A Laboratory Manual 
in the Psychology of Learning is published in the hope of furthering and 
facilitating experiments in educational psychology. As now presented it 
represents the results of fourteen years of experimentation. 


CONTENTS: Experimentation, The Mathematical Treatment of Data, Motor Learn- 
ing—Trial and Error Type, Motor Learning, Semi-motor Learning, Retention of Motor 
Learning, A Study of Inhibition, Tachistoscopic Learning, Serial Learning, Associative
Learning, Verbatim Learning of Poetry, Ideational Learning, Comparative Study of All
the Experiments. 















Warwick and York, Publishers, Baltimore 



















